My grandfather has been working at
the DFW airport as a janitor for years. Recently, his coworkers all got a
75-cent raise, from $9.25 up to $10.00. There was one caveat: you have to pass what
seems to be a written exam on airport security (the questions relate to TSA
policy and the like). The only problem is, my grandpa doesn't speak English, so
even new employees are currently being paid more than he is. What makes this
situation worse is that the exam is taken on a computer -- my grandpa hardly
knows how to move the mouse.
However, we have something we can
work with: a list of handwritten questions and answers. My grandpa
requested that I print it out in a bigger font so he can read it.
But I want to take it further: How
can I create a study plan so as to guarantee his success?
We have the following facts:
1. He cannot understand written
English.
2. He can, however, recognize
English letters and the sounds they make.
3. We have a list of questions,
paired with answers (TRUE or FALSE).
4. He hasn't been in school for a
really long time.
5. He would have no trouble
understanding this if it were in Korean.
Let us examine the list. First,
an obvious approach would be to memorize each question and its
corresponding answer. But we immediately see that this isn't necessary: each
question can be identified by less than its full length. The question now is:
is there a minimal uniform "key length" -- a number of words to memorize from
the start of each question -- that suffices to tell them all apart? For example, suppose we are given the following
sentences:
· Bob is a chef.
· Bob is not a chef.
· Alice is a chef.
· Alice is not a chef.
Notice that memorizing only the
first word of each question is not sufficient to distinguish the questions,
since two of them start with "Bob" (and two with "Alice"). We quickly see that 2 is not enough
either -- "Bob is" still matches two sentences -- but 3 works, so 3 is our key length.
After we establish the key length,
we need only to learn to associate the keys -- "Bob is a" is one in
this example -- with the correct answer (in our case, TRUE or FALSE).
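To make this concrete, here is a short Python sketch of both steps (the sentences are from the example above; the TRUE/FALSE answers are made up purely for illustration):

```python
def minimal_key_length(sentences):
    """Smallest k such that every sentence's first k words are unique."""
    tokenized = [s.split() for s in sentences]
    for k in range(1, max(len(t) for t in tokenized) + 1):
        prefixes = [" ".join(t[:k]) for t in tokenized]
        if len(set(prefixes)) == len(prefixes):
            return k
    return None  # two sentences are identical

questions = {  # answers invented for the sake of the example
    "Bob is a chef.": True,
    "Bob is not a chef.": False,
    "Alice is a chef.": True,
    "Alice is not a chef.": False,
}

k = minimal_key_length(list(questions))  # k == 3
study_map = {" ".join(q.split()[:k]): a for q, a in questions.items()}
# {'Bob is a': True, 'Bob is not': False,
#  'Alice is a': True, 'Alice is not': False}
```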
The only problem is, the set of
questions we are working with has a minimal key length of almost the entire
length of some of the sentences! So here is the situation:
· Bob is a chef.
· Bob is not a chef.
· Alice is a chef.
· Alice is not a chef.
· Carol is an absolutely fantastic chef who lives in New York.
· Carol is an absolutely fantastic chef who lives in New Jersey.
Notice that our minimal key length
here would be 10 (counting "New York" and "New Jersey" each as one word),
since the last two sentences differ only in their final word. This means we
would be memorizing the first four sentences in their entirety.
But naturally, anyone would quickly
realize that they should memorize the first 3 words of the first 4
sentences and just the last word of the 5th and 6th. In other words, the uniform
key length restriction is not only inefficient, but also artificial.
Once we throw off the restriction,
we need a method to generate keys for each sentence. A key must be unique and
effective: that is, there needs to be a fast, human-computable function that
takes a sentence and yields a unique key. So essentially we have compressed our
data.
If we restrict our range to
"substrings starting from the beginning or the end", an
example mapping is not hard to figure out:
· Bob is a chef. => Bob is a
· Bob is not a chef. => Bob is not
· Alice is a chef. => Alice is a
· Alice is not a chef. => Alice is not
· Carol is an absolutely fantastic chef who lives in New York. => York
· Carol is an absolutely fantastic chef who lives in New Jersey. => Jersey
The corresponding method to compute
this map would be to start from the beginning, try to recognize a substring,
and if you fail, start from the end and try to recognize one.
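Here is one way such keys could be generated automatically -- a sketch under my own assumptions (keys are whole-word prefixes or suffixes, and the three-word cap on "acceptable" prefixes is an arbitrary threshold I chose):

```python
def shortest_unique(tokenized, i, reverse=False):
    """Shortest word-prefix (or -suffix) of sentence i shared by no other."""
    seqs = [t[::-1] if reverse else t for t in tokenized]
    target = seqs[i]
    for k in range(1, len(target) + 1):
        if all(seqs[j][:k] != target[:k] for j in range(len(seqs)) if j != i):
            piece = target[:k]
            return " ".join(piece[::-1] if reverse else piece)
    return None  # no unique prefix/suffix exists

def make_keys(sentences, max_prefix=3):
    tokenized = [s.rstrip(".").split() for s in sentences]
    keys = {}
    for i, s in enumerate(sentences):
        prefix = shortest_unique(tokenized, i)
        if prefix is not None and len(prefix.split()) <= max_prefix:
            keys[s] = prefix  # a short prefix exists: use it
        else:
            # prefix too long: fall back to the end of the sentence
            keys[s] = shortest_unique(tokenized, i, reverse=True)
    return keys

# Run on the six sentences above, this reproduces the mapping:
# 'Bob is a', 'Bob is not', 'Alice is a', 'Alice is not', 'York', 'Jersey'.
```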
The great benefit of using
substrings as keys is that the key already exists inside the sentence,
so computing the function reduces to a recognition task: once you spot the
substring in the input, you output it.
[An aside: Human recall/recognition
is what I like to call "partially fast" -- if it exists in our memory
vault, we're usually pretty quick to recognize it, but if it isn't there, the process
is a lot less reliable*. We can model this with an associative array, or hash
table: lookup is usually really fast, but if it doesn't exist we might have to
search through countless locations to verify this. And that’s exactly the
thing: we’re bad at searching, since
we don’t have location-addressable memory. We can see this effect with test
taking: multiple choice is far easier than free-response tasks, since the
latter requires searching through a swath of potentially relevant information.
Whether we retrieve the proper information depends on whether or not the
information has proper associative links (i.e., X reminds me of Y which is
related to Z, so I recall Z), and whether our mind has enough time, energy, or
luck to traverse the right connections. On the other hand,
autoassociative tasks are easy as long as the signal is sufficiently clear.]
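As a toy illustration of that aside (my own sketch, not a real cognitive model): recognition is a single fast lookup, while recall is a slow pass over everything, hoping an associative link leads somewhere useful.

```python
def recognize_or_recall(cue, memory):
    """memory: dict mapping known keys to answers."""
    if cue in memory:                    # recognition: one cheap, fast hit
        return memory[cue]
    cue_words = set(cue.split())
    for key, answer in memory.items():   # recall: trudge through locations,
        if cue_words & set(key.split()): # following crude associative links
            return answer                # (here, mere word overlap)
    return None                          # the search came up empty

memory = {"Bob is a": True, "Bob is not": False}
recognize_or_recall("Bob is a", memory)       # fast: exact hit
recognize_or_recall("a chef is Bob", memory)  # slow: found only by association
```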
But we have pigeonholed ourselves
into only learning by syntactic methods. There are various other, more natural
methods, including but not limited to: sounding out the sentences, mapping the
sounds to the correct answer, or interpreting the sentences semantically. The
benefit to sounding out the words is that my grandpa does know how to
rudimentarily translate groups of letters to sounds, so the function is already
there. And sounds tend to be more familiar, so this might help with
memorization.
My grandpa also suggested a method:
he wants a translation of all the sentences into Korean, along with a sounded-out
phonetic transliteration. There are two ways we can implement this. The first
way would be to take an entire sentence and map it to its meaning. However, to
me this is horribly inefficient: if you can map a sentence to a meaning, why not map
it straight to the binary value we need in the first place, TRUE or FALSE!
The second way makes a little more
sense. You take some significant words of each sentence – in our example it
might be “chef”, “Bob”, “Alice”, “fantastic”, etc. – and map these to the
meaning. This is essentially dimensionality
reduction – each sentence is considered as the sum of a few special “parts”
– in this case a small set of important words which we will have him learn. The
caveat is when the sentences don’t really share much of a common basis – then
memorizing the words and their meaning becomes an extra layer of inefficiency.
So this method is a little situational.
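To make the "small basis" idea concrete, here is a brute-force Python sketch (my own construction, practical only for small lists) that searches for the fewest words whose presence-or-absence pattern already distinguishes every sentence:

```python
from itertools import combinations

def find_basis(sentences, max_size=5):
    """Smallest set of words whose presence/absence pattern is
    unique for every sentence (exhaustive search)."""
    tokenized = [set(s.rstrip(".").split()) for s in sentences]
    vocab = sorted(set().union(*tokenized))
    for size in range(1, max_size + 1):
        for basis in combinations(vocab, size):
            fingerprints = [tuple(w in words for w in basis)
                            for words in tokenized]
            if len(set(fingerprints)) == len(fingerprints):
                return basis
    return None

sentences = [
    "Bob is a chef.", "Bob is not a chef.",
    "Alice is a chef.", "Alice is not a chef.",
    "Carol is an absolutely fantastic chef who lives in New York.",
    "Carol is an absolutely fantastic chef who lives in New Jersey.",
]
find_basis(sentences)  # ('Alice', 'Bob', 'Jersey', 'not'): 4 words suffice
```

Learning which of those four words each sentence contains is then enough to identify it -- precisely the kind of dimensionality reduction described above.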
When the student speaks of
“understanding”, dimensionality reduction may be what he/she is trying to do.
Memorization of independent facts, or even just strings, is no intellectual
challenge, but it is time-consuming. So the mind longs for a coherent system, one
that is smaller than the initially presented, overwhelming batch of apparently
true sentences. Surely there must be a simple framework upon which all of this
was derived, the brain thinks.
But perhaps it's not only about smallness; there is also the factor of familiarity.
We are already familiar with a space of countless principles and ideas – so can
we “embed” these new concepts as a subspace thereof?
In summary, when it comes to
choosing a method for a memorization task, we have to keep two things in mind:
Dimensionality: Are the objects we are
memorizing (sounds, words) complex or simple? That is, how large is the
smallest set of objects (a basis) whose combinations form the original set? For
instance, our example with Alice, Bob, and Carol turned out to have a fairly
low dimensionality, since for our problem we only had to recognize a few key
words.
Familiarity: Are the objects we are memorizing
familiar to us – i.e. do the objects
or related objects already exist in memory to any extent?
For example, suppose we substitute every
letter in an English sentence with some unique hieroglyph or code
number: the dimensionality stays the same (we're just doing a one-to-one
transformation), but familiarity is drastically reduced. It is probably easier
for most to memorize “fnefhew” than “5 4 1 5 9 1 8”, so it might be easier to
learn the encoding and then decode each number into a letter first, especially
if you have lots of these numbers. This is probably why sounding out the words
works as a method, and how mnemonics actually help you memorize things despite
often being longer than the original object. In the same way, memorizing
sentences by sound seems to be advantageous.
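A tiny sketch of that "decode first" strategy (the letter-number code here is invented for illustration; the digits quoted above aren't tied to any stated encoding):

```python
# Hypothetical code: 1 -> 'a', 2 -> 'b', ..., 26 -> 'z'.
code = {str(i): ch for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz", start=1)}

def decode(numbers):
    """Turn an unfamiliar digit sequence into familiar letters,
    then memorize the letters instead."""
    return "".join(code[n] for n in numbers.split())

decode("5 4 1 5 9 1 8")  # -> 'edaeiah'
```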
*Although we seem quick to declare "I don't recognize this", this is not the same as saying "I have never seen it". One may sometimes not recognize a person's face until much later.