Introduction to Informatics: The Study of Computers and People (Lecture Notes)

An overview of Informatics, a multidisciplinary field that combines aspects of software engineering, human-computer interaction, and the study of organizations and information technology. It covers the history, applications, and branches of Informatics, as well as its relationship with fundamental sciences such as mathematics, physics, and electronics.

What you will learn

  • How does Informatics relate to other scientific disciplines?
  • What is Informatics and what are its main branches?
  • What are some real-world applications of Informatics?
  • How did Informatics originate and develop?
  • What are the main areas of interest for computer scientists?


1 Introduction to Informatics

1.1 Basics of Informatics

Informatics is a very young scientific discipline and academic field. The interpretation of the term (in the sense used in modern European scientific literature) has not yet been established and generally accepted [1]. The homeland of most modern computers and computer technologies is the United States of America, which is why American terminology in informatics is intertwined with that of Europe. The American term computer science is considered a synonym of the term informatics, but the two terms have different histories, somewhat different meanings, and are the roots of concept trees filled with different terminology. While a specialist in the field of computer science is called a computer engineer, a practitioner of informatics may be called an informatician.

The history of the term computer science begins in 1959, when Louis Fein advocated the creation of the first Graduate School of Computer Science, which would be similar to Harvard Business School. In justifying the name of the school, he referred to management science, which, like computer science, has an applied and interdisciplinary nature and the characteristics of an academic discipline. Despite its name, most of the scientific areas related to the computer do not include the study of computers themselves. As a result, several alternative names have been proposed in the English-speaking world; for example, some faculties of major universities prefer the term computing science to emphasize the difference between the terms. Peter Naur suggested the Scandinavian term datalogy, to reflect the fact that the scientific discipline operates on and handles data, although not necessarily with the use of computers. Karl Steinbuch introduced the German term Informatik in 1957, and Philippe Dreyfus introduced the French term informatique in 1962. The English term informatics was coined as a combination of two words: information and automation; originally, it described the science of automatic processing of information. The central notion of informatics was the transformation of information, which takes place through computation and communication by organisms and artifacts. Transformations of information enable its use for decision-making.

Q&A: What is informatics?

Technical definition: Informatics involves the practice of information systems engineering and of information processing. It studies the structure, behavior, and interactions of natural and artificial systems that collect, generate, store, process, transmit, and present information. Informatics combines aspects of software engineering, human-computer interaction, and the study of organizations and information technology; one can say it studies computers and people. In Europe, the same term, informatics, is often used for computer science (which studies computers and computer technologies).

Business definition: Informatics is a discipline that combines information technologies, computer science, and business administration into one field.

A number of experts in the field of computer science have argued that there are three distinct paradigms in computer science. According to Peter Wegner, these three paradigms are science, technology, and mathematics. According to Peter J. Denning, they are theory, modeling, and design. Amnon H. Eden described them as the rationalist, technocratic, and scientific paradigms. In his opinion, within the rationalist paradigm computer science is a branch of mathematics; mathematics dominates theoretical computer science and relies mainly on logical deduction. The technocratic paradigm is the most important in software engineering. Within the scientific paradigm, computer science is a branch of empirical science, but it differs in that it carries out experiments on artificial objects: software and hardware.

Figure 1: Foundations of computer science and informatics.

An overview of computer science can be borrowed from Charles Chen [2]. He referred to the mathematical, theoretical, and practical (application) branches as key components. The mathematical branch is devoted to systems modeling and to creating applications for solving mathematical problems. Some related areas of study are classical and applied mathematics, linear algebra, and number theory. The theoretical branch covers algorithms, languages, compilers and data structures. This branch is based on


the first electronic computer. Others prefer to date it from the first electric (and even mechanical) calculators. As always happens in such situations, there are many opinions and justifications for each point of view. If we take the 1950s as the beginning of the history of modern informatics, all previous events (important in the context of information processing science) are considered prehistoric. There were some very important inventions and discoveries during this prehistory, which allow us to trace the historical logic of the creation of modern information technologies.

To understand the roots of informatics, one should look at the history of computer technology [5], which includes a multitude of diverse devices and architectures. The abacus from ancient Babylon (300 BC) and China (500 BC) are the oldest known historical examples. The Jacquard loom invented by Joseph Marie Jacquard (1805) and the analytical engine invented by Charles Babbage (1834) are the first examples of zero-generation (prehistoric) computers: mechanical machines designed to automate complex calculations. De facto, Babbage's engine was also the first multi-purpose programmable computing device. After him, Georg and Edvard Scheutz (1853) constructed a smaller machine that could process 15-digit numbers and calculate fourth-order differences. Ada Lovelace (1815-52) collaborated with Charles Babbage. She is said to be the first programmer; she saw the potential of the computer over a century before it was created. It took more than a century, until the end of the 1960s, for mechanical devices (e.g. the Marchant calculator) to find widespread application in engineering and science.

First-generation electronic computers (1937-1953) used electronic components based on vacuum tubes. This was the period when both digital and analog electric computers were developed in parallel. The first electronic digital computer was the ABC, invented by John Vincent Atanasoff at Iowa State University. The second early electronic machine was Colossus, designed by Tommy Flowers (1943) for the British military. The first general-purpose programmable electronic computer was ENIAC (1943-1945), built by J. Presper Eckert and John V. Mauchly at the University of Pennsylvania. In fact, as late as the 1960s, analog computers were still used to solve systems of finite difference equations. Nevertheless, digital computing devices won the competition, because they proved to be more useful when dealing with large-scale computations (more computing power, more scalable and economical). Software technology during this period practically did not exist; programs were written out in machine code. Only in the 1950s did a symbolic notation, known as assembly language, begin to be used.

Second-generation computers (1954-1962) were based on semiconductor elements (discrete diodes and transistors) with a very short switching time and randomly accessed magnetic memory elements. There was a new opportunity to perform calculations in the format of real numbers (floating point). High-level programming languages such as FORTRAN, ALGOL and COBOL were created during this time. These changes enabled the production of the first computers designed not only for science and the military, but also for commerce. The first two supercomputers (LARC and IBM


were designed during the same period. These were machines that had much more calculating power than others and could perform parallel processing.

The key changes in the third generation of computers (1963-1972) concerned the use of integrated electronic circuits instead of discrete elements and the use of semiconductor memory instead of magnetic cores. Changes in computer architecture were associated with the spread of operating systems and parallel processing techniques. These developments led to a revolutionary increase in the speed of execution of calculations. The key figure for the development of computer technology in this period was Seymour Roger Cray, who established the supercomputer industry through his new approaches to architecture. His computers attained for the first time a computation rate of 10 million floating-point operations per second (10 Mflops). At the same time (1963), Cambridge and the University of London developed in cooperation the Combined Programming Language (CPL), which became the prototype for a number of other major programming languages. Early implementations of the UNIX operating system were based on the language B, which was derived from CPL.

The next (fourth) generation of computer systems (1972-1984) was marked by large and very large scale integration of electronic components on chips. This enabled an entire processor, and even an entire simple computer, to fit onto a single chip. That very important change allowed for a reduction in the time required to execute basic calculations. Most systems acquired a semiconductor main memory, which also increased the speed of information processing. There were two very important events at the beginning of the fourth generation: the development at Bell Labs of the C programming language and of the UNIX operating system (written in C). In a short time, UNIX was ported to all significant computers; thousands of users were freed from the obligation to learn a new operating system each time they changed computer hardware. The new trend in software methodology during the fourth generation was the introduction of a declarative style of programming (Prolog) alongside the imperative style (C, FORTRAN).

The fifth generation of computer technology (1984-1990) was characterized mainly by the widespread adoption of parallel processing, which means that a group of processors can work on different parts of the same program. At that time, fast semiconductor memories and vector processors became standard on all computers; this was made possible by the rapid growth of the scale of integration (in 1990, it was possible to build integrated circuits with a million components on one chip). Thus, all the processors participating in parallel calculations may be located on the same PC board. Another new trend was the common use of computer networks; connected to the network, single-user workstations created the conditions for individual work at the computer, in contrast to the previous style of work in a group.

The latest, i.e. the sixth generation of computers (1990- ) is distinguished by the exponential growth of wide-area networking. Nowadays, there is a new category of computers (netbooks), which are useful only when using the Internet. Another


Software designer Tim Paterson is the original author (1980) of MS-DOS; de facto, Bill Gates only rebranded his operating system, known at first as QDOS (Quick and Dirty Operating System). Paterson worked for Microsoft on MSX-DOS and on Visual Basic projects. Daniel Singer Bricklin and Robert M. Frankston are the creators of VisiCalc, the first spreadsheet computer program. It was historically the first program to transform microcomputers from playthings into the professional tools needed to perform specialized calculations.

Robert Elliot Kahn and Vinton Gray Cerf are the inventors of two basic communication languages, or protocols, of the Internet: TCP (Transmission Control Protocol) and IP (Internet Protocol). Both TCP and IP are currently in widespread use on commercial and private networks. Niklaus Wirth is known as the chief designer of several programming languages, including Pascal. He has written very influential books on the theory and practice of structured programming and software engineering. James Gosling invented (1995) the Java programming language, which became popular during the last decade. Compiled Java applications typically run on Java virtual machines regardless of computer architecture; in this way, independence from the computer architecture is achieved.

The inventor of the World Wide Web (WWW, an internet-based hypermedia initiative for global information sharing), Timothy John Berners-Lee, implemented in 1989 the first successful client-server communication via the Internet using the Hypertext Transfer Protocol (HTTP). Nowadays he is the director of the W3C, the main international standards organization for the WWW. Linus Benedict Torvalds created Linux, the kernel of the operating system GNU/Linux, which is currently the most common of the free operating systems. Currently, only about two percent of the current kernel has been written by Torvalds himself, but he remains the decision maker on amending the official Linux kernel tree. It is not possible to mention here all who have contributed to the development of modern computer technology. There have been, and will continue to be, many others.

1.1.3 Areas of Computer Science

Computer science combines a scientific and a practical approach. On the one hand, computer scientists specialize in the design of computational and information systems; on the other hand, they deal with the theory of computation and information. To understand the areas of interest of computer scientists, we must consider from where they or their teachers came to computer science: from electronics, mathematics, physics, and so on.


In 2008, the ACM Special Interest Group on Algorithms and Computation Theory published its vision of theoretical computer science themes that have the potential for a major impact on the future of computing and which require long-term, fundamental research. These are the issues of algorithms, data structures, combinatorics, randomness, complexity, coding, logic, cryptography, distributed computing, and networks, among others. Of course, some central research areas were not represented there at all, but this vision can help us to understand the field of computer science theory. Let us browse the theoretical and applied areas which are relatively stable and are represented in computer science curricula. Generally, the theoretical sciences form a subset of wide-ranging computer science and mathematics. This is why all of them focus on the mathematical aspects of computing and on the construction of mathematical models of different aspects of computation. Someone who loves mathematics can find among these disciplines his or her preferred area of interest. The applied sciences are directly related to the practical use of computers. The following lists contain the names of the most popular theoretical and applied disciplines belonging to the area of computer science:

List 1. Theoretical computer science

  • Algorithms and data structures
  • Algorithmic number theory
  • Computer algebra or symbolic computation
  • Formal methods
  • Information and coding theory
  • Programming language theory
  • Program semantics
  • The theory of computation
  • Automata theory
  • Computability theory
  • Computational complexity theory
  • Formal language theory

List 2. Applied computer science

  • Artificial intelligence
  • Computer architecture and engineering
  • Computer performance analysis
  • Computer graphics and visualization
  • Computer security and cryptography
  • Computational science
  • Computer networks
  • Concurrent, parallel and distributed systems
  • Databases
  • Health informatics
  • Information science
  • Software engineering

1.1.4 Theoretical and Applied Informatics

Before the end of the 1970s, cybernetics was the dominant term among the sciences related to the processing of information; correspondingly, theoretical informatics was named mathematical (theoretical) cybernetics. Briefly, theoretical informatics is a mathematical discipline which uses the methods of mathematics to construct and study models of the processing, transmission and use of information. Theoretical informatics creates the basis on which the whole edifice of informatics is built. By its very nature,


a boundary position between theoretical informatics and cybernetics. The same boundary position is occupied by two more disciplines. Simulation is one of them; this science develops and uses special techniques for the reproduction of real processes on computers. The second science is queuing theory, which is the mathematical study of a special but very broad class of models of information transmission and information processing, the so-called queuing systems. Generally, models of queuing systems are constructed to predict queue lengths and waiting times.
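To make the predictive use of queuing models concrete, here is a minimal sketch (not from the source) of the standard M/M/1 single-server model, assuming Poisson arrivals and exponential service times; the arrival and service rates below are made-up illustrative values.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (single server,
    Poisson arrivals, exponential service times).
    Valid only when utilization rho = arrival_rate / service_rate < 1."""
    rho = arrival_rate / service_rate              # server utilization
    if rho >= 1:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    l_system = rho / (1 - rho)                     # mean number of customers in the system
    w_system = 1 / (service_rate - arrival_rate)   # mean time spent in the system
    w_queue = rho / (service_rate - arrival_rate)  # mean waiting time in the queue
    return {"utilization": rho, "L": l_system, "W": w_system, "Wq": w_queue}

# Example: 8 requests/s arriving at a server that handles 10 requests/s.
print(mm1_metrics(8.0, 10.0))  # utilization 0.8, L = 4, W = 0.5 s, Wq = 0.4 s
```

The closed-form expressions used here are the textbook steady-state results for M/M/1; more complicated systems are usually analyzed by the simulation techniques mentioned above.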

1.1.4.1 The Theory of Games and Social Behavior

The last class of disciplines included in theoretical informatics focuses on the use of information for decision-making in the variety of situations encountered in the world. It primarily includes decision theory, which studies the general scheme used by people when choosing their solutions from a variety of alternatives. Such a choice often has to be made in situations of conflict or confrontation; that is why models of this type are studied by game theory. A decision maker always wants to choose the best of all possible solutions. The problems which arise in such a choice situation are studied in the discipline known as mathematical optimization, also known as mathematical programming. To achieve its goals, the decision-making process must obey a single plan; the study of ways of building and using such plans is provided by another scientific discipline, operations research. If not individual but collective decisions are made, many specific situations arise, e.g., the emergence of parties, coalitions, agreements and compromises. The problem of collective decision-making is examined in the theory of collective behavior.

1.1.4.2 Applied Informatics

Cybernetics, which has already been mentioned, can be seen as the first applied informatics discipline, concerned with the development and use of automatic control systems of diverse complexity. Cybernetics originated in the late 1940s, when Norbert Wiener first put forward the idea that the control systems in living systems and in non-living, artificial systems share many similarities. The discovery of this analogy promised the foundation of a general theory of control, the models of which could be used in newly designed systems. This hypothesis has not withstood the test of time, but the principles concerning information management systems have greatly benefited from it. Technical cybernetics was the most fully developed branch. It includes automatic control theory, which became the theoretical foundation of automation.

As the second applied informatics discipline, programming owes its appearance entirely to computers. In the initial period of its development, programming lacked a strong theoretical base and resembled the work of the best craftsmen. With experience, programming has groped towards the general ideas that underlie the construction of computer programs and of programming arrangements themselves. This has resulted in the gradual establishment of theoretical programming, which


now consists of multiple directions. One of them is connected with the creation of a variety of programming languages designed to make human interaction with computers and information systems easy.

Artificial intelligence is the youngest discipline of applied informatics, but it now determines the strategic directions of the development of the information sciences. Artificial intelligence is closely related to theoretical computer science, from which it has borrowed many models and methods, such as the active use of logical tools for transforming knowledge. Equally strong is its relation to cybernetics. The main objective of work in the field of artificial intelligence is the desire to penetrate the secrets of the creative activity of people, their ability to master skills, knowledge and abilities. To do this, one needs to uncover those fundamental mechanisms by which a person is able to learn almost any kind of activity. Such a goal closely associates researchers in the field of artificial intelligence with the achievements of psychology. In psychology, a new area is now actively developing: cognitive psychology, which focuses precisely on examining the laws and mechanisms that interest specialists in the field of artificial intelligence. The sphere of interests of experts in the field of artificial intelligence also includes linguistic studies. Mathematical and applied linguistics likewise work closely with research in the field of artificial systems designed for natural language communication. Artificial intelligence is not a purely theoretical science; it touches on applied issues related to the construction of real, existing intelligent systems, such as robots.

The significant practical importance of informatics manifests itself in the field of information systems. This trend was created by researchers in the field of documentology (the scientific study of documents, including the examination of their structure) and by analysts of scientific and technical information; such work was conducted even before computers. However, true success with information systems was reached only when computers became part of them. Now, within this area, a few basic problems are being solved: the analysis and forecasting of various information streams, the study of methods of information presentation and storage, the construction of procedures to automate the process of extracting information from documents, and the creation of information search systems. On the one hand, research in the field of information systems is based on applied linguistics, which creates languages for the operative saving of information and for quickly finding answers to incoming requests in data warehouses. On the other hand, the theory of information supplies this research with models and methods that are used to organize the circulation of information in data channels.

Computer engineering is a completely independent line of applied research that integrates several fields of electrical engineering and computer science required to develop computer hardware. Within this field many problems are solved which are not directly related to informatics. For example, numerous studies are conducted to improve the element base of computers. The progress of modern informatics is unthinkable without the evolution of computers, the main and the only tool for

1.2 Relationship with Some Fundamental Sciences

1.2.1 Informatics and Mathematics

The mathematics of current computer science is constructed entirely on discrete mathematics, especially on combinatorics and on graph theory. Discrete mathematics (in contrast with continuous mathematics) is a collective name for all branches of mathematics which deal with the study of discrete structures, such as graphs or statements in logic. Discrete structures contain no more than countable sets of elements. In other words, discrete mathematics deals with objects that can assume only separated values. Besides combinatorics and graph theory, these branches include cryptography, game theory, linear programming, mathematical logic, matroid theory, and number theory, all of which are used intensively by informaticians. The advantage is that nontrivial real-world problems can be explored quickly by using the methods of these discrete disciplines. One can even say that discrete mathematics constitutes the mathematical language of informatics.

From the beginning, informatics has been very solidly based in mathematics (Figure 1). In addition, theoretical informatics could be considered a branch of mathematics; it is easy to notice that both share a conceptual apparatus. The more precisely circumscribed area of computer science definitely includes many things which would not be considered mathematics, such as programming languages or computer architecture. The computerization of the sciences, including mathematics, has also stimulated those sciences. Questions from informatics inspired a great deal of interest in some mathematical branches, e.g. in discrete math, especially in combinatorics. Some mathematical challenges arise from problems in informatics (e.g. complexity theory); they demand the use of innovative techniques in other branches of math (e.g. topology). Fundamental problems of theoretical computer science, like the P versus NP problem, have acquired importance as central problems of mathematics.

The two-way conversation between informaticians and mathematicians is profitable. Mathematicians consider computational aspects of their areas to facilitate the construction of appropriate virtual objects (or entities, in terms of software). Numerous techniques increase mathematical sophistication, which results in efficient solutions to the most important computational problems. Informatics has activated many mathematical research areas like computational number theory, computational algebra and computational group theory. Furthermore, a diversity of computational models has been drafted to explain, and sometimes anticipate, existing computer systems, or to develop new ones, like online algorithms and competitive analysis, or parallel and distributed programming models. Here are some explanations of the exemplary models:

  • An online algorithm has a special restriction; it does not receive its input data from the beginning as a whole, but in batches (rounds). After each round, the algorithm has to provide a partial answer. An example is the allocation of CPU time or memory (scheduling): because in general it is not known which processes will


require resources, it is necessary to allocate them based only on the current situation. Competitive analysis compares the performance of such algorithms to the performance of an optimal offline algorithm that can view the sequence of requests in advance (a small illustration follows this list).

  • Distributed computation is a solution for the big data problem. Unfortunately, it is very difficult to program due to the many processes which are involved: sending data to nodes, coordinating among nodes, recovering from node failure, optimizing for locality, debugging and so on. The MapReduce programming model was suggested to allow such data sets to be processed with a parallel, distributed algorithm on a computer cluster. The model was inspired by functional programming. Applications which implement the MapReduce framework achieve high scalability and fault tolerance, which is obtained by optimizing the execution engine (a minimal sketch also follows below).
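As an illustration of online algorithms and competitive analysis mentioned in the first item, here is a minimal sketch (not from the source) of the classic ski-rental problem; the buy price and the range of tested inputs are made-up illustrative values.

```python
# Ski-rental: rent skis for 1 per day, or buy them once for cost B.
# An online algorithm does not know in advance how many days it will ski.

BUY_COST = 10  # hypothetical buy price B

def offline_cost(days):
    """Optimal offline cost: the total number of days is known in advance."""
    return min(days, BUY_COST)

def online_cost(days):
    """Classic break-even strategy: rent for B-1 days, then buy on day B."""
    return days if days < BUY_COST else (BUY_COST - 1) + BUY_COST

# Competitive analysis: compare the online cost to the offline optimum
# over many possible inputs and take the worst ratio.
worst_ratio = max(online_cost(d) / offline_cost(d) for d in range(1, 100))
print(worst_ratio)  # 1.9, i.e. 2 - 1/BUY_COST
```

The break-even strategy shown here is (2 - 1/B)-competitive: whatever the actual number of days turns out to be, its cost is at most that factor times the optimal offline cost.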
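The MapReduce idea itself can be sketched in a few lines. The following toy word count is a hedged single-process sketch, not an actual cluster framework; it only mimics the map, shuffle and reduce phases, and the function names and sample chunks are invented for illustration.

```python
from collections import defaultdict
from itertools import chain

# Map phase: each "node" turns its chunk of text into (word, 1) pairs.
def map_phase(chunk):
    return [(word.lower(), 1) for word in chunk.split()]

# Shuffle phase: group intermediate pairs by key.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: combine the values of each key into a final result.
def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

chunks = ["informatics studies computers and people",
          "computer science studies computers"]
pairs = list(chain.from_iterable(map_phase(c) for c in chunks))
print(reduce_phase(shuffle(pairs)))
# e.g. {'informatics': 1, 'studies': 2, 'computers': 2, ...}
```

In a real framework such as Hadoop or Spark, the same map and reduce functions would run on many cluster nodes, with the shuffle and fault handling provided by the execution engine.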

1.2.2 Informatics and Mathematical Logic

Mathematical logic, or symbolic logic, is a branch of mathematics that studies mathematical notation, formal systems, verifiable mathematical judgments, computability, and the nature of mathematical proof in general. More broadly, mathematical logic is regarded as a mathematized branch of formal logic, developed with the help of mathematical methods. Topically, mathematical logic stands in close connection with metamathematics, the foundations of mathematics, and theoretical computer science. Logic in computer science is the direction of research in which logic is applied in computation technologies and artificial intelligence. Logic is very effective in these areas. However, one should not forget that some important research in logic was driven by the development of computer science, for example, applicative programming, computation theory and computational modeling. From the very beginning, informatics has depended heavily on logic; recall that Boolean logic and algebra were used for the development of computer hardware. Logic is part of information technology, for example, in relational data models, relational databases, relational algebra, and relational calculus. In addition, logic has provided fundamental concepts and ideas for informatics, which can naturally use formal logic. For example, this applies to the semantics of programming languages. Here are some very important applications of logic in the field of informatics:

  • Formal methods and logic reasoning about concepts in semantic networks and semantic web;
  • Problem solving and structured programming for application development and the creation of complex software systems;
  • Probative programming − the technology of development of algorithms and programs with proof of the correctness of algorithms;
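As a toy illustration of the last item, checking code against a logical specification, here is a minimal sketch (not from the source); exhaustive checking over a small finite domain is used as a stand-in for a genuine formal proof, and the function, specification and domain are invented for illustration.

```python
from itertools import product

# Specification: my_max(a, b) returns a value that is >= both inputs
# and is equal to one of them.
def my_max(a, b):
    return a if a >= b else b

def satisfies_spec(f, a, b):
    result = f(a, b)
    return result >= a and result >= b and result in (a, b)

# Exhaustive check over a small finite domain; not a formal proof,
# but it mirrors the idea of verifying code against a logical specification.
domain = range(-3, 4)
assert all(satisfies_spec(my_max, a, b) for a, b in product(domain, repeat=2))
print("my_max satisfies its specification on the test domain")
```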


mode, so these specific states are only two; they are referred to as zero and one. The theory of the design and operation of the various electronic components, like transistors, is the subject of electronics, which is a part of electrical engineering and a branch of physics. To emphasize the type of basic materials used in modern electronic components and circuits, the terms semiconductor electronics or solid-state electronics are used. To underline the digital operating mode of electronic circuits intended for the construction of computers, the term digital electronics is used.

Computer engineering fuses electronics with computer science to develop faster, smaller, cheaper, and smarter computing systems. Computer engineers design and implement hardware (computer architectures); design and implement systems software (operating systems and utility software); design processors; and design and implement computer-computer and human-computer interface systems. One can say that computer engineers are engaged in analyzing and solving computer-oriented problems. Computer engineering also deals with the embedded computers that are parts of other machines and systems, as well as with computer networks, which are developed for data transfer. It is an interesting fact that progress in the micro-miniaturization of integrated electronic circuits has been, and remains, heavily supported by informatics. There have long been popular industrial software tools, called ECAD (electronic design automation) tools, for designing electronic systems such as printed circuit boards and integrated circuits.

1.2.5 Informatics and Linguistics

Computational linguistics joins linguistics and informatics; it is one of the cognitive sciences, which partly overlaps with the field of artificial intelligence and concerns understanding natural language from a computational perspective. Theoretical computational linguistics deals with formal theories about language generation and understanding; the high complexity of these theories forces them to be managed by computer. One of the main goals of theoretical study [8] is the formulation of grammatical and semantic frameworks for characterizing languages, enabling a computational approach to syntactic and semantic analysis. In addition, computational linguists discover processing techniques and learning principles that exploit both the structural and distributional properties of language. This kind of research is not possible without answering the question: how do language learning and processing work in the brain?

Applied computational linguistics, called language engineering, focuses on the methods, techniques, tools and applications in this area. Historically, computational linguistics was initiated to solve the practical problem of automatic written text translation. Machine translation was recognized as being far more difficult than had originally been assumed. In recent years, we can observe the growth of technology for


analysis and synthesis of spoken language (i.e., speech understanding and speech generation). Besides designing interfaces that operate in a natural language, modern computational linguistics has considerable achievements in the field of document processing, information retrieval, and grammar and style checking. Computational linguistics partially overlaps with natural language processing, which covers applied methods of language description and language processing for computer systems generally. This entails:

  • Creation of electronic text corpora;
  • Creation of electronic dictionaries, thesauri, ontologies;
  • Automatic translation of texts;
  • Automatic extraction of facts from texts;
  • Automatic text summarization;
  • Creation of natural language question-answering systems;
  • Information retrieval.
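Several of the tasks above, particularly automatic extraction of facts and information retrieval, rest on counting and weighting words. Here is a minimal sketch (not from the source) of tf-idf style document ranking; the toy documents, the query and the scoring details are invented for illustration.

```python
from collections import Counter
import math

documents = {
    "doc1": "informatics studies computers and people",
    "doc2": "computational linguistics joins linguistics and informatics",
    "doc3": "information retrieval finds relevant documents",
}

def tokenize(text):
    return text.lower().split()

# Score each document by how often it contains the query terms,
# weighted by inverse document frequency (the standard tf-idf idea).
def score(query, docs):
    n = len(docs)
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        scores[name] = sum(tf[t] * math.log(n / df[t])
                           for t in tokenize(query) if df[t])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(score("informatics and linguistics", documents))  # doc2 ranks highest
```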

Computational linguistics also embraces topics of computer-assisted second language learning. This is not only about the use of computers as an aid in the preparation, presentation, and assessment of material to be learned; the modernization of educational methods, such as the utilization of multimedia and web-based computer-assisted learning, is also sought.

1.2.6 Informatics vis-à-vis Psychology and Sociology

There are more and more reasons for social, behavioral, or cognitive scientists (psychologists and sociologists) to acquire a basic familiarity with informatics tools and techniques, which give them new abilities to produce, publish and evaluate their research. One can speak of a new phenomenon: computational social science [9]. Here is an incomplete list of what the new computational approach can bring to psychology and sociology:

  • Web-based data collection methods;
  • Mobile data collection methods;
  • Data manipulation and text mining;
  • Computerized exploratory data analysis and visualization;
  • Big Data and machine learning applications.

A new and still little-known discipline, psychoinformatics [10], already helps psychologists to mine data and to discover patterns based on relations among the data; such patterns ultimately reflect specific psychological traits. Furthermore, the development of psychology and informatics has reached a stage at which psychology can try to model the information processes of human thought. Some cognitive scientists even define psychology, metaphorically, as the informatics of the human mind [11]. In their view,


The impact of the economy on IT can be seen in the formation of the agent-based programming paradigm. The idea for this paradigm came from a commercial brokerage model. A software agent is a kind of small independent program which mediates the exchange of information between other programs or people. Agents must be prepared to receive incorrect data from another agent, or to receive no data at all; this means that agent-based systems should be prepared to function in conditions of uncertainty. Agent-based modeling promises new analytical possibilities not only for business, but also for the social sciences. At present, computing scientists expect help from economists to solve the problems of optimizing software agents.

Completely new types of algorithmic problems from the natural sciences are challenging theoretical informatics: namely, problems in which the required output is not well defined in advance. Typical data might be a picture, a sonogram, readings from the Hubble Space Telescope, stock-market share values, DNA sequences, neuron recordings of animals reacting to stimuli, or any other sample of "natural phenomena". The algorithm (like the scientist) is "trying to make sense" of the data, "explain it", "predict future values of it", etc. The models, problems and algorithms here fall into the research area of computational learning and its extensions.
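To make the agent picture above concrete, here is a minimal sketch (not from the source) of a broker agent collecting quotes from seller agents that may fail to answer; the class names, prices and failure rate are invented for illustration.

```python
import random

# A seller agent that answers price queries, but may fail to respond,
# simulating the uncertainty an agent-based system must tolerate.
class PriceAgent:
    def __init__(self, name, price):
        self.name, self.price = name, price

    def quote(self):
        return self.price if random.random() > 0.3 else None  # ~30% no answer

# A broker agent that collects quotes and copes with missing data.
class BrokerAgent:
    def best_offer(self, agents):
        quotes = {a.name: a.quote() for a in agents}
        valid = {n: p for n, p in quotes.items() if p is not None}
        if not valid:
            return None  # no agent responded; decide under uncertainty
        return min(valid.items(), key=lambda kv: kv[1])

sellers = [PriceAgent("A", 102.5), PriceAgent("B", 99.0), PriceAgent("C", 101.2)]
print(BrokerAgent().best_offer(sellers))
```

The point of the sketch is the uncertainty handling: the broker must still decide something sensible when some, or all, of the quotes are missing.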

1.3 Information Theory

The term information entered into scientific use long before the rapid development of electronic communications and computing. Information (from the Latin word informatio, which means clarification, presentation, interpretation) in the most common sense means details about something transmitted by people, directly or indirectly. Initially, this concept was associated only with communicative activities in the community. The understanding of information as messages sent by people (orally, in writing or otherwise, e.g. by conventional signals or technical facilities) persisted until the mid-1920s. Gradually, the concept of information gained a more and more universal meaning.

Before the beginning of the 1920s, information was treated on a qualitative level; formal concepts, procedures and methods of quantification were not used. The main focus was on the mechanisms of influencing the receivers of information and on the ways of ensuring its accuracy, completeness, adequacy, etc. The subsequent refinement of the scientific meaning of information was carried out by scientists in different directions. At first, they tried to include it as part of the structures of other general concepts (e.g. probability, entropy, and diversity); R. S. Ingarden (1963) demonstrated the failure of these attempts. The development of the technical facilities of mass communication (the telephone, telegraph, radio, television, computer networks) led to a snowballing of the number of transmitted messages. This created the need to evaluate various characteristics of information, and in particular its volume. With the development of cybernetics,


information became one of the main categories of its conceptual apparatus, along with concepts such as management and communication. Nowadays, the field of information theory belongs partly to mathematics, computer science, physics, electronics and electrical engineering. Important subfields of information theory are measures of information, source and channel coding, algorithmic complexity theory, information-theoretic security, and others.

Information theory investigates the processes of storage, conversion and transmission of information. All these processes are based on a certain way of measuring the amount of information. A key measure of information is entropy, which is expressed by the average number of bits needed to store or communicate one symbol in a message. Entropy quantifies the uncertainty involved in predicting the value of a signal. Mathematical information theory was first developed by Claude E. Shannon (1948) to find fundamental limits on signal processing operations; it was based on probabilistic concepts about the nature of information. The area of correct use of this theory is limited by three postulates: 1) only the transmission of information between technical systems is studied; 2) a message is treated as a limited set of characters; 3) the semantics of messages is excluded from the analysis. This theory gave engineers of data transmission systems the possibility of determining the capacity of communication channels. It focuses on how to transmit data most efficiently and economically, and how to detect errors in transmission and reception. Information theory is sometimes seen as the mathematical theory of information transmission systems, because the primary problems of this science arose from the domain of communication. It establishes the basic boundaries of data transmission systems and sets the basic principles of their development and practical implementation. Since its inception, information theory has broadened to find applications in many other areas, e.g. data compression and channel coding.

Shannon's theory, which studies the transmission of information, is not concerned with the meaning (semantics) of the transmitted message. As attempts to overcome its limitations, new versions of Shannon's mathematical theory of information appeared: topological, combinatorial, dynamic, algorithmic and others. However, they take into account only the symbolic structure of messages, and so they can be attributed only to theories of the syntactic, not the semantic, type. There is a later, complementary piece of information theory that draws attention to content through the phenomenon of lossy compression of the subject of the message, using a criterion of accuracy. Unfortunately, the attempts made to connect the probabilistic-statistical (syntactic) approach with the semantic approach have not yet led to any constructive results. Nevertheless, it may be noted that the category of information has infiltrated the relevant scientific fields, so it has obtained the status of a general scientific concept. To the original meaning of information were added:

  • A measure of reducing the uncertainty as a result of the message;
  • A collection of factual knowledge, e.g. circulating in the management process;
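To make the entropy measure discussed above concrete, here is a minimal sketch (not from the source) that estimates the Shannon entropy of a message from its own symbol frequencies; the example strings are invented for illustration.

```python
import math
from collections import Counter

def shannon_entropy(message):
    """Average number of bits needed per symbol of the message,
    estimated from the symbol frequencies observed in the message itself."""
    counts = Counter(message)
    total = len(message)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

print(shannon_entropy("aaaa"))  # 0.0 -- no uncertainty at all
print(shannon_entropy("abab"))  # 1.0 -- one bit per symbol
print(shannon_entropy("abcd"))  # 2.0 -- two bits per symbol
```

The printed values match the intuition from the text: a message with no uncertainty needs zero bits per symbol, while four equally likely symbols need two.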