Informational Complexity of the Innate Knowledge of Human Beings

From P2P Foundation


Andreas Keller:

"The following is a very crude estimate of the informational complexity of the innate knowledge of human beings. To be more exact, it is a crude estimate of an upper limit to the information content of this knowledge. It might be off by an order of magnitude or so. So, this is a “back of an envelope” or “back of a napkin” kind of calculation. It just gives a direction into which one would have to go to try to get a more accurate calculation. To get a more exact number, the parameters put in here as mere estimates must be determined with more precision. This can be done by doing some research of the relevant literature and, where the required information has not yet been determined by science, by doing additional genetic and neurological research.

According to the human proteome project [58], roughly 67% of human genes are expressed in the brain. Most of these genes are also expressed in other parts of the body, so they probably form part of the general biochemical machinery of all cells. However, 1,223 genes have been found to have an elevated level of expression in the brain. The brain-specific structures must be encoded in these genes, or mainly in them, or in a subset of them. Some of these genes are probably not directly involved in determining the distribution and connectivity of neurons and might instead form part of the brain-specific cellular infrastructure underlying those networks. Either way, the innate knowledge must be encoded in genes whose combined activity somehow leads to the development of the brain. One possible source of error is that the study might not have looked at genes active in the fetus or in the small child’s developing brain (I do not know), so some more genes may be involved; but for the sake of this estimate, I assume that the innate information is somehow represented in these 1,223 genes.

There are roughly 20,000 genes in the human genome, so the 1,223 genes make up about 6.115% of our genes (by number); that is, about 6.115% of our genes are brain specific. We probably share many of these with primates and other animals, such as rodents, so the truly human-specific part of the brain-specific genes, or the human-specific part of their sequences, might be much smaller. However, I am interested here only in an order-of-magnitude result for an upper limit.

I have no information about the total length of these brain-specific genes, hence I will assume that they have average length. According to some sources [59], the human genome has 3,095,693,981 base pairs [60]. According to the same source, only roughly 2% of this is coding DNA. There is also some non-coding DNA that has a regulating function, or is involved in the production of some types of RNA, but let us assume that the functional part of the genome is perhaps 3%. That makes something on the order of 92 to 93 million base pairs with a function (probably less), or 30 to 31 million triplets (remember that bases are read in groups of three, each group coding for an amino acid or acting as a start or stop signal for translation). If the brain genes have average length, 6.115% of this would be brain specific. This makes for something like 1.89 million triplets.
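The chain of multiplications in the preceding paragraph can be retraced in a few lines. This is only a sketch of the arithmetic under the text's own assumptions (3% functional DNA, genes of average length); the constant and function names are illustrative and do not come from the source.

```python
# Figures quoted in the text; all of them are rough estimates.
GENOME_BASE_PAIRS = 3_095_693_981   # total base pairs in the human genome
FUNCTIONAL_FRACTION = 0.03          # assumed functional share of the genome
BRAIN_ELEVATED_GENES = 1_223        # genes with elevated expression in the brain
TOTAL_GENES = 20_000                # approximate number of human genes


def brain_specific_triplets(
    genome_bp: int = GENOME_BASE_PAIRS,
    functional_fraction: float = FUNCTIONAL_FRACTION,
    brain_genes: int = BRAIN_ELEVATED_GENES,
    total_genes: int = TOTAL_GENES,
) -> float:
    """Estimate the number of triplets in brain-specific genes."""
    functional_bp = genome_bp * functional_fraction   # about 92.9 million base pairs
    triplets = functional_bp / 3                      # about 31 million triplets
    brain_fraction = brain_genes / total_genes        # 0.06115, i.e. 6.115%
    # Assumes brain-specific genes have average length.
    return triplets * brain_fraction


estimate = brain_specific_triplets()
print(f"Brain-specific triplets: about {estimate / 1e6:.2f} million")
```

Changing any one of the four inputs shows how strongly the result depends on it; for example, taking 2% instead of 3% as the functional fraction lowers the estimate by a third.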

The different triplets code for 20 different amino acids. There are also start and stop signals. The exact information content of a triplet depends on how often it appears, and triplets are definitely not equally distributed, but let us assume that each codes for one out of 20 equally likely possibilities. Calculating the exact information content would require more sophisticated reasoning and specific information about the frequency distribution of triplets, and hence of amino acids, but for our purposes this is enough. The information content of a triplet can then be estimated as the binary (base-2) logarithm of 20. You need 4 bits to encode 16 possibilities and 5 bits to encode 32, so the value lies between 4 and 5 bits; more precisely, it is about 4.322. Multiplying this by the roughly 1.89 million triplets gives about 8.2 million bits, or roughly 1.02 million bytes: about a megabyte, comparable to the information content of a typical book. These genes might contain a lot of redundancy, in the sense that it might be possible to compress a complete description of these sequences (i.e., to “zip” them) into a smaller amount of information. Hence, the information content of the brain-coding genes that determine the structure of the brain is on the order of a megabyte [61], and possibly much smaller.
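The conversion from triplets to bits and bytes, and the reason the uniform assumption yields an upper limit, can be illustrated as follows. This is a minimal sketch: the rounded triplet count is taken from the text, the function names are illustrative, and the skewed frequency distribution is purely hypothetical (it is not measured amino-acid data).

```python
import math

TRIPLETS = 1_890_000  # "something like 1.89 million triplets", rounded as in the text


def information_upper_bound(triplets: float, outcomes: int = 20) -> tuple[float, float]:
    """Return (bits, bytes), assuming each triplet is one of `outcomes` equally likely values."""
    bits = triplets * math.log2(outcomes)
    return bits, bits / 8


def shannon_entropy(probabilities: list[float]) -> float:
    """Average information per symbol, in bits, for a given frequency distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


bits, n_bytes = information_upper_bound(TRIPLETS)
print(f"log2(20) = {math.log2(20):.3f} bits per triplet")
print(f"Upper bound: about {bits / 1e6:.1f} million bits, {n_bytes / 1e6:.2f} million bytes")

# Uniform case: all 20 outcomes equally likely, so the entropy equals log2(20).
uniform = [1 / 20] * 20
# Hypothetical skewed case: two outcomes take 20% each, the other 18 share the rest.
skewed = [0.2] * 2 + [0.6 / 18] * 18

print(f"Uniform entropy: {shannon_entropy(uniform):.3f} bits")
print(f"Skewed entropy:  {shannon_entropy(skewed):.3f} bits")
```

Any non-uniform distribution over the 20 outcomes has lower entropy than the uniform one, which is why the figure of about a megabyte is an upper limit even before compression is considered.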

The structure of the brain is somehow generated out of the information contained in these genes. The megabyte figure is probably an overestimate, because many of these genes might not be involved in encoding the connective pattern of the neurons but instead, for example, in the glial immune system of the brain or other brain-specific, “non-neuronal” components. Many of them might also be the same or nearly the same in apes, monkeys and rodents, so the human-specific part could be much smaller still.

Therefore, in comparison to the complexity of our civilization, the innate part of our knowledge is tiny. It is probably much smaller than that of some specialized animals. What probably happened on the way from specialized animals to human beings was that, on the one hand, specialized innate knowledge disappeared and, on the other hand, non-specialized but “plastic” or “reworkable” parts of the brain were expanded.

The cognitive development of humans, therefore, starts with the innate core, but is perhaps mostly guided by the cultural environment."