The pigeonhole principle dictates that if we have separate boxes or weights for each fact that we want to teach to or store in an AI, then we need to grow the network with every fact that does not already fit.
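A rough sketch of that counting argument in Python. The function name `min_weights` and the idea that a weight holds a fixed number of bits are simplifying assumptions for illustration, not a claim about how real networks store information:

```python
def min_weights(num_facts: int, bits_per_fact: int, bits_per_weight: int) -> int:
    """Lower bound on the weights needed to hold independent facts.

    Assumption: facts are independent and each weight can hold at most
    `bits_per_weight` bits, so the pigeonhole principle applies directly.
    """
    total_bits = num_facts * bits_per_fact
    # Integer ceiling division: a partially used weight still counts as a weight.
    return -(-total_bits // bits_per_weight)


# Hypothetical numbers: 32-bit facts stored in 16-bit weights.
for n in (1_000, 1_000_000):
    print(n, "facts need at least", min_weights(n, 32, 16), "weights")
```

Under these assumptions the bound grows linearly with the number of facts: no fixed-size network can hold an unbounded number of independent facts.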

Can we compress the data? If the facts are dependent on each other, yes.

But that still leaves us with "random" facts like the names of places and people.
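To make the difference between dependent and "random" data concrete, here is a small experiment using `zlib` from the Python standard library. The generated records and the seeded random bytes are made-up stand-ins for real facts, so treat this as an illustration rather than a measurement of any AI model:

```python
import random
import zlib

# Dependent data: every record follows the same rule, so records predict each other.
dependent = "".join(f"item {i} costs {i * 2} units\n" for i in range(500)).encode()

# "Random" facts: bytes with no relationship to each other (seeded for repeatability).
rng = random.Random(0)
independent = bytes(rng.getrandbits(8) for _ in range(len(dependent)))


def ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means better compression."""
    return len(zlib.compress(data, 9)) / len(data)


dependent_ratio = ratio(dependent)
independent_ratio = ratio(independent)
print(f"dependent:   {dependent_ratio:.2f}")
print(f"independent: {independent_ratio:.2f}")
```

The dependent records shrink to a fraction of their size, while the independent bytes stay at roughly their original size. Names of people and places behave like the second case.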

An example

Say I want the AI to remember correctly that I wrote this blog post, that my name is Stefan, and that I posted it on estada.ch.

So first we need the AI to overfit on this article, then on my name (there is no reason for that particular name), and then on the domain (also a completely random name that I liked).

This means that any AI needs at least 3 weights just to connect these facts to this article and recognise them correctly, and the article alone probably takes more than one weight.

Adding more and more facts

Every additional fact increases the training time, so this approach becomes very expensive very quickly:

[Figure: time to train vs number of weights]

A proposal

In my opinion, the future of a correct AI needs three things:

Partition the AI into an intelligent part and a hard facts part
Show the end user how certain an answer is
Show where the hard facts came from

The hard facts can come from multiple databases. For example, the phone book or DNS could be one source of truth. An internal wiki or an encyclopedia could be another.

Combining these sources, then detecting conflicting information and highlighting it, will allow a user to actually work with AI.

Because if we cannot trust it, the only use cases are art and places where hallucinations are acceptable or even desired.
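A minimal sketch of what the hard facts part of the proposal could look like. Everything here is a hypothetical illustration: the names `SourcedFact`, `lookup` and `answer`, the example sources, and the certainty score (simply the share of sources that agree) are assumptions, not an existing system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcedFact:
    value: str
    source: str  # where the hard fact came from


def lookup(key: str, sources: dict) -> list:
    """Collect every value the sources hold for a key, with its provenance."""
    return [
        SourcedFact(value=facts[key], source=name)
        for name, facts in sources.items()
        if key in facts
    ]


def answer(key: str, sources: dict) -> dict:
    """Return the majority value, a naive certainty score, and any conflicts."""
    found = lookup(key, sources)
    if not found:
        return {"value": None, "certainty": 0.0, "sources": [], "conflicts": []}
    counts = {}
    for fact in found:
        counts[fact.value] = counts.get(fact.value, 0) + 1
    best = max(counts, key=counts.get)
    return {
        "value": best,
        # Assumption: certainty is just the fraction of sources that agree.
        "certainty": counts[best] / len(found),
        "sources": [f.source for f in found if f.value == best],
        "conflicts": [f for f in found if f.value != best],
    }


# Hypothetical sources of truth; "old_notes" deliberately disagrees.
sources = {
    "dns": {"blog_domain": "estada.ch"},
    "internal_wiki": {"blog_domain": "estada.ch", "author": "Stefan"},
    "old_notes": {"blog_domain": "example.org"},
}

result = answer("blog_domain", sources)
print(result["value"], result["certainty"], result["sources"])
for conflict in result["conflicts"]:
    print("conflict:", conflict.value, "from", conflict.source)
```

The intelligent part would only phrase the answer. The value, its certainty and its origin come from the lookup, and disagreeing sources are shown to the user instead of being silently dropped.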