Reliability is a measure of the quality of information, reflecting its completeness and accuracy. Its signs include the legibility of written and spoken language, the absence of false or otherwise distorted information, and a low probability of erroneous handling of information units, including letters, symbols, bits, and numbers. The accuracy of information and of its source is also rated directly on a scale (for example, from “reliable in full” through “predominantly reliable” and “relatively reliable” down to “completely unreliable” or “status not defined”).
What does this mean?
Reliability characterizes how undistorted the information is. It depends not only on the authenticity of the information itself, but also on the adequacy of the methods by which it was obtained.
Unreliability, by contrast, may mean that data were deliberately prepared as false. Yet there are cases when unreliable information nevertheless yields reliable conclusions: this happens when the degree of uncertainty in the information is already known to the recipient at the moment it is received. In general, the following pattern holds: the more source data available, the higher the reliability of the resulting information.
Adequacy of Information
Thus, reliability is directly related to the adequacy of information, its completeness and objectivity. This property matters most when data are used to make decisions: unreliable information leads to decisions with negative social, political, or economic consequences.
Let us therefore consider the concept of information reliability in more detail.
Defining reliable and false information
Information is false if it does not correspond to the real state of affairs: it describes phenomena, processes, or events that never existed, or that did exist but are reported in a distorted or incomplete way.
Information can be called reliable when it raises no doubt at all, being real and genuine. It includes information that, if necessary, can be confirmed by legally correct procedures: documents, expert opinions, invited witnesses, and so on. Data can also be considered reliable if they refer to a primary source; in that case, however, the problem shifts to determining the reliability of the source itself.
Types of information sources
Sources of information may include:
- individuals whose authority or position gives them access to information of interest to various media;
- the real environment (for example, the urban, material, or natural environment that people inhabit);
- print media with an imprint, i.e. textbooks, books, encyclopedias, or journal articles;
- Internet sites, portals, and pages on which the media may also draw.
Undoubtedly, one of the most authoritative and safest sources is the document, but it counts as such only when legal verification is possible. Documents are characterized by completeness of information.
Competent and incompetent
In addition to being reliable or unreliable, sources can also be competent or incompetent.
The most widely cited sources are those authorized by official bodies. State institutions, first of all, are expected to provide citizens with the most objective and accurate information. However, even a government press-service release can be falsified, and there is no guarantee that unreliable information cannot leak from a state source. That is why receiving information does not mean trusting it unconditionally.
Since only information that corresponds to reality is reliable, the skill of checking data and determining their degree of reliability is very important. Mastering it helps you avoid all kinds of misinformation traps. The first step is to identify the semantic load of the information received: is it factual or evaluative?
Monitoring the accuracy of information is extremely important. Facts are what a person encounters first on receiving new information; they are information already verified for reliability. If information has not been verified, or cannot be, it contains no facts. Facts include numbers, events, names, and dates: anything that can be measured, confirmed, touched, or listed. Most often they come from sociological and research institutes, statistical agencies, and the like. The key feature distinguishing a fact from an evaluation is the objectivity of the former: an evaluation always reflects someone’s subjective view or emotional attitude, and often calls for certain actions.
Differentiating and comparing sources of information
When obtaining information, it is also important to distinguish between its sources. Since the overwhelming majority of facts can hardly be verified independently, the reliability of the data obtained is judged by the trust placed in the sources that supplied them. How do you check an information source? The main criterion of truth is practice: whatever helps accomplish a specific task. Another dominant criterion is effectiveness, shown by the number of people who have successfully applied the information. The more of them there are, the more confidence the data inspire, and the higher their reliability. This is the basic principle of information reliability.
It is also useful to compare sources with one another, since qualities such as credibility and popularity do not by themselves guarantee reliability. Hence the next important attribute of information: consistency. Each fact obtained from a source should be confirmed by independent studies, that is, it should be reproducible. If re-analysis leads to the same conclusions, the information is established as consistent. It follows that isolated, one-off information does not in itself deserve much confidence.
The following proportion is observed: the more such information is obtained from different sources, the higher its degree of reliability. Each source answers for the facts it provides, not only morally but also materially. An organization that distributes data of doubtful origin can easily lose its reputation, and sometimes the very means of its existence; beyond losing its audience, it may face fines or even imprisonment. That is why reputable sources with a certain authority will not risk their standing by publishing false information.
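The pattern above (more independent confirmations, higher reliability) can be illustrated with a toy probability model. This is only a sketch under my own simplifying assumption that sources err independently with known rates; it is not a claim about any real scoring method:

```python
def prob_all_wrong(error_rates):
    """Probability that every independent source is wrong at once.

    Toy model (assumption): sources err independently, so the chance
    that a claim confirmed by all of them is still false is the product
    of their individual error rates.
    """
    p = 1.0
    for e in error_rates:
        p *= e
    return p

# Three independent sources, each wrong 20% of the time: the chance
# that they all agree on a falsehood drops to under one percent.
print(round(prob_all_wrong([0.2, 0.2, 0.2]), 3))  # → 0.008
```

The independence assumption is exactly what clone sites break: if several sites merely copy one another, they count as one source, not three.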
What to do if a specific individual becomes a source of information?
There are situations when the source of information is not an organization but a particular person. In these cases, find out as much as possible about the author in order to determine how far information coming from him should be trusted. You can verify the reliability of the data by familiarizing yourself with the author’s other works and his sources (if any), or by establishing whether he is in a position to provide such information at all.
The latter is determined by an academic degree or relevant experience in the field, as well as by the position the author holds. Otherwise the information may well be useless or even harmful. If the reliability of information cannot be verified in any way, it may as well be considered meaningless. When searching for information, first clearly articulate the problem to be solved: this reduces the risk of misinformation.
If information is anonymous, its accuracy can in no way be guaranteed. Any information should have an author and be backed by his reputation. In principle, the most valuable data are those whose source is an experienced person rather than a random one.
How to evaluate the reliability of search results?
In yesterday’s article, “An instructive story on how important it is to really know how to use Google, told by Albert Einstein in 1954,” I wrote about how important it is to be able to evaluate the reliability of the results a search query brings us. Searching on this topic, I was surprised to find that almost no one does it. A rare exception is Igor Ashmanov, on whose work the conclusions below largely rest. So, we face the task: find information on a given topic.
The art of searching consists of two parts:
- proper construction of the search query;
- evaluation of the search results and selection of the most relevant ones.
A great deal of material has been written about the first point. But most sources reduce everything to the technical side of the issue: the correct use of the query language of one particular search engine. Knowing the mechanics can, of course, significantly improve the relevance of the results, but under one condition: the query itself must be specified correctly. Alas, a mistake at this stage costs a lot of time in refinement, and it is often easier to start all over again. As Mikhail Talantov wrote back in 1999:
It is usually necessary to start with a comprehensive lexical analysis of the information to be searched: obtain from some source a detailed and competent description of the issue under study. Such a source may be a highly specialized reference book or a general-purpose electronic encyclopedia. From the material studied, form the widest possible set of keywords in the form of separate terms, phrases, professional vocabulary, and clichés, in several languages if necessary. Provide in advance for the possibility of refining your query: rare words, names, and surnames closely tied to the problem. It is also advisable to anticipate which of the chosen terms may draw irrelevant documents into the search engine’s response. Only after accumulating this baggage should you proceed to gather preliminary information from the Net.
Since 1999, the Internet has changed, but the principles have remained the same.
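Talantov’s recipe (a wide set of core terms plus pre-planned refinements) can be sketched as a small helper. Everything here is illustrative: the function name, the phrase quoting, and the refinement scheme are my own assumptions and are not tied to any particular search engine’s query language.

```python
from itertools import product

def build_queries(core_terms, refinements=()):
    """Combine core terms with optional refining terms into query strings.

    core_terms:  lists of synonyms, one list per concept, e.g.
                 [["information reliability", "data credibility"]].
    refinements: rare words or names used to narrow the query.
    """
    queries = []
    for combo in product(*core_terms):          # every synonym combination
        base = " ".join(f'"{t}"' for t in combo)  # quote multi-word phrases
        queries.append(base)
        for extra in refinements:               # add each refining term
            queries.append(f"{base} {extra}")
    return queries

qs = build_queries(
    [["information reliability", "source credibility"],
     ["evaluation", "verification"]],
    refinements=["Ashmanov"],
)
for q in qs:
    print(q)
```

With two concepts of two synonyms each and one refinement, this yields eight query variants to try in turn, from the broadest to the narrowest.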
In this article, however, I would like to focus on the second part of the art of search: how to evaluate the relevance of the results a search engine returns for our query.
There are many ways to evaluate how well the documents found by a search engine match the query. Unfortunately, the degree of correspondence to a query, in other words relevance, is a subjective notion: it depends on the individual who evaluates the results. So in each case, relevance should be assessed individually, by criteria matching the purpose of the study.
It seems that the right approach to assessing relevance is to follow the classic journalistic method of validating information:
- confirm it from at least two independent sources;
- check that the source has no stake in the content;
- compare the information received with what is already known on the topic;
- have the information validated by reputable experts;
- request additional details from the source confirming the truth of the main message.
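As a toy illustration, the checklist can be turned into a simple counter. The criterion names and the equal weighting below are my own assumptions, not part of any journalistic standard:

```python
# Hypothetical checklist scorer: each satisfied criterion adds one point.
CRITERIA = [
    "confirmed_by_two_independent_sources",
    "source_is_disinterested",
    "consistent_with_known_facts",
    "validated_by_experts",
    "details_confirmed_by_source",
]

def reliability_score(checks):
    """Count how many validation criteria a piece of information passes."""
    return sum(1 for c in CRITERIA if checks.get(c, False))

claim = {
    "confirmed_by_two_independent_sources": True,
    "consistent_with_known_facts": True,
}
print(reliability_score(claim), "of", len(CRITERIA))  # → 2 of 5
```

A real assessment is of course qualitative; the point of the sketch is only that each criterion is checked separately and a claim failing most of them deserves little trust.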
In the case of an Internet search, this means the following:
- Having found some information on the Web, check whether it is confirmed by independent sources. The ease of publishing on the Internet leads many bloggers, and often network journalists, to handle information very carelessly. If the information found is confirmed by several sites, check whether those sites are clones: alas, most “fakes” today are seeded onto several resources at once precisely to overcome this first barrier. On the other hand, if the information found contradicts what is already on the network, the question arises of which version to consider reliable. When preparing arguments for my discussions, I very often find that facts which seemed self-evident turn out to be myths. It is important to be able to adjust in time and rebuild your argument around newly discovered circumstances.
- The most important check on the quality of information is verifying that the source has no stake in its content. Today this means, first of all, watching for paid links and openly biased material (usually betrayed by a sloppy logical connection to the context or by a slanted presentation). It is also important to avoid taking information from “garbage sites” whose authors publish indiscriminately and take no responsibility for quality; a telltale sign of such a site is heterogeneous content, a stylistic and thematic mosaic. Sites whose content is generated automatically or semi-automatically we simply ignore. Another important test is whether the information is internally consistent: contradictions are, as a rule, the result of negligence, or of conscious or unconscious misinformation.
- The second most important condition for selecting quality information is comparison with data already known on the topic. Real sensations are rare: plausible information has a much better chance of being reliable than a sensational scandal. If you see a teaser on a news site with the text “Just enter ANY SURNAME and in a minute find out everything about the person,” common sense should tell you that you will not get what you expect.
- Check with reputable experts. If material you have found seems interesting enough to be worth the effort, but you are not an expert on the issue, try to find the opinion of specialists. I described in detail yesterday how this is done, using the story of the Pritchard filter as an example. Note that if experts have not yet discussed the issue, you can initiate such a discussion yourself; it is only important to choose the right site.
- Finally, publications on the Internet are not made by the priests of the pharaoh Amenhotep, who cannot be contacted in any way! Do not forget the simple approach: just write to the author, provided, of course, that he has listed contact details on the site. If there are none, you have one more reason to doubt the reliability of the information.
For most readers, however, all this so far remains mere generalities: the reliability criteria here are very individual and, as a result, very blurred.
Let's move on to more specific examples.
Igor Ashmanov considers the assessment of the reliability of data using the example of a popular Q&A service. How do you judge which of the answers best fits your question?
Assessing the reliability of the data offered in an answer is the hardest part of judging the answer’s quality. An expert may well be mistaken or deliberately misleading, so it is necessary to evaluate not only the information offered but also its source. Two more terms are worth introducing here: authorization and authority.
Authorization is the binding of the data offered in an answer to a known source; the source may be a link to a website, a book, and so on.
Authority is a characteristic of the data source. If the expert does not cite a source in the answer but speaks in the first person, the authority of the expert himself is assessed.
Applied to search results, this means that a reliability criterion for the material found may be a link to a known source.
A typical example: yesterday a friend of mine posted a note on the VKontakte social network claiming that back in 1988 the magazine Science and Life published an article about a survey Apple had conducted among students, asking what they thought a computer should be like in the year 2000, and that the collective portrait of the “ideal computer” assembled from the answers resembled the iPad like two drops of water.
Friends immediately shouted: “Fake! Fake! This cannot be, because it can never be!”
However, a scan of the corresponding page of the magazine was presented at once, and the question was settled. The article, by the way, is genuinely interesting; it is worth reading even today, in 2011.
A link to an authoritative source removes the problem. At least half of it: as Ashmanov rightly noted, “an expert may well be mistaken or deliberately mislead.” And nobody has abolished Photoshop fakes. :-)
Igor Ashmanov examines in detail how an expert’s qualifications can be verified (again using the “Questions and Answers” example, but his conclusions also apply to analyzing the blogs in which we find information).
Of course, the cited material is aimed more at experts, so that they can look at themselves from the outside. But for us, the consumer side of the question also matters.
And the last question. Is it possible to build a machine that goes bam! and tells us: this site is trustworthy, and that one is an outright fake?
The question is phrased mockingly, but the world is seriously working on automatic algorithms for assessing the reliability of websites:
Andreas Juffinger, a researcher at the Austrian science and technology center Know-Center, notes that this problem is especially pressing because of the proliferation of blogs, where anyone can write anything. Juffinger and his colleagues are working on a program that will analyze web diaries and automatically rank them by degree of reliability. To this end, the software will examine the statistical properties of sites - for example, the frequency with which certain words are used per unit of time - and compare them with news resources that have already earned trust.
“The results are promising; we are on the right track,” Andreas Juffinger said at the international conference WWW 2009, held this week in Madrid. “Assessment of the reliability of sites cannot but be automatic, because readers are not able to compare all blogs with each other.”
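The idea of comparing a site’s word statistics with already-trusted resources can be sketched naively as a frequency comparison. Nothing below reproduces the Know-Center system; the relative-frequency representation and cosine similarity are my own stand-ins for whatever statistics they actually use:

```python
from collections import Counter
from math import sqrt

def word_freq(text):
    """Relative word frequencies of a text, lowercased."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two frequency vectors, in [0, 1]."""
    common = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in common)
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

trusted = word_freq("election results confirmed by the commission")
blog = word_freq("election results confirmed by observers")
print(round(cosine_similarity(trusted, blog), 2))  # → 0.73
```

Even this toy version shows the weakness discussed below: the score reflects only surface vocabulary overlap with the chosen reference texts, not whether anything said is true.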
Is it possible to create such an automatic system?
First, it is unclear who will select the reference texts, by what procedure, and whose point of view they will reflect. In Iran, say, the texts of Ahmadinejad might be taken as the reference, and everyone “statistically disagreeing” with the president would be identified and purged...
Second, who decides, and how, without semantics, which reference text a given document should be compared against?
For example, how would a “statistical analyzer” understand that a piece headlined “Nemtsov crept up unnoticed” is about the elections in Sochi? And if the material concerns an Ivanov (we have quite a few of them in the upper echelons of public politics), how, without semantics, do you decide which Ivanov’s texts to compare it against? And what about a perfectly truthful police report along the lines of “Citizen Ivanov committed a violation. After numerous exhortations on my part he stopped, not because he came to his senses, but because he was exhausted”? Filter it out for tarnishing somebody’s bright image?
And third. The experience of tailoring websites and documents to the statistical ranking criteria of search engines shows that it is by no means the most relevant or the best materials that end up at the top... Do we really think the authors of published materials will be unable to “tailor” their texts to the requirements of such an analyzer? They will tailor them, all right.
No, it will not fly. Without semantics, the whole thing “does not dance.”
And who today can perform the function of a semantic reliability analyzer?
The person who has been trained to do it. That person is called a librarian.
In the end, whatever anyone says, everything depends on the person: on his ability to separate the wheat from the chaff and to pick out, from the muddy stream of search results, the crumbs that make up real knowledge.