Train Your Own Neural Network

These days, using Machine-Learning, and Deep-Learning in particular, to solve technical challenges has become the norm.

That’s mainly thanks to having access to unprecedented volumes of data, hardware advancements, and academic progress.

Many problems are tackled by modeling Neural-Networks and feeding them tons of data, so that they “learn” and become artificially “smarter”.

While we humans are still smarter than our computers, we process information far more slowly.

We can’t read a million books in a lifetime.
Neither can we write a billion lines of code,
speak 100 different languages fluently, or paint a million drawings.

Yet, we are still able to accomplish quite a lot.
Much more than we think we can.

There is the classic saying that “Practice makes Perfect”.
This is only partly true, because it’s also the case that “Practice makes Permanent”.

Now usually comes the part saying that we need to do Deliberate Practice consistently for many years. The thing is that there are many ways to practice deliberately. There is no one-size-fits-all formula applicable to all domains. And of course - people are different.

I’d like this article to focus on a single side of deliberate practice - I call it the “Train Your Own Neural Technique” technique.

For brevity, I’ll use the TYONT acronym for the rest of the post.
The essence of TYONT is feeding your brain with tons of data relevant to things you care about.

I’m dividing that training data into two categories: Factual and Pattern-Oriented. First, let’s address data belonging to the Factual category. We’ll differentiate between general data and domain-specific data.

Factual data - General:

This is straightforward - these are pieces of data that are mere facts.

Examples:

Historical
- When did the French Revolution take place?
- Who was the president of the United States during World War II?
Wisdom Quotes
- “Success is never final. Failure is never fatal. Courage is the only thing.” (Winston Churchill)
- “Education is what remains after one has forgotten what one has learned in school.” (Albert Einstein)
Vocabulary
- Give two synonyms for the word: decisive

Factual data - Domain-specific:

Since I’m a developer and this blog assumes its readers are too, I’ll use programming facts here, but you can see how this translates to many other fields.

Shortcuts
- How do I re-open Chrome’s last closed Tab?
Commands
- How do I discard local uncommitted Git changes via the command-line?
Syntax / Libraries
- How do I create a new module in a given language?
- What does a common library function expect as its 2nd parameter?

Patterns:

Now, let’s move on and talk about the more interesting category: Patterns. Unlike factual data, which is well-defined, Patterns are not.

When I refer to a Pattern here, I mean a core solution technique that is relevant to a wide class of problems.

If a problem has a unique (or rare) solution that doesn’t translate to other similar cases - then I would NOT consider it a Pattern.

Although there is no well-defined algorithm that, given a challenge, can tell whether its solutions consist of common techniques or not - I’ll still try my best to argue that this category deserves a place. Please bear with me.

So instead of writing: “Challenges having common solutions that can be reused in many places” - I’ll stick with the term Pattern.
I’m also aware that what one person views as a Pattern, another wouldn’t - and that’s totally fine.

What’s important is that everyone can have their own classification and decide whether they regard something as a Pattern (or not).

Since it’s an individual judgment - here are a few examples of things I’d call Patterns:

Math
- Prove that √2 is an irrational number.
  
  Why a Pattern? - The classic solution is a proof by contradiction. We assume there exist two natural numbers N and M having no common divisor other than 1 such that N / M = √2.
  Building on top of this - we’ll eventually contradict the assumption that N and M have no common divisor other than 1.
  (click here for the full proof).
  
  I see it as a Pattern since I can imagine many Math problems that would require a technique similar to the one we used here with N and M.
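  
  For readers who want the core steps without following the link, here is a short sketch of the standard argument, written in LaTeX. N and M are the two natural numbers from the assumption (no common divisor other than 1); K is a helper symbol introduced only for this sketch.
  
  ```latex
  \begin{align*}
  \sqrt{2} = \frac{N}{M}
    &\implies N^2 = 2M^2
    && \text{so } N^2 \text{ is even, hence } N \text{ is even} \\
  N = 2K
    &\implies 4K^2 = 2M^2 \implies M^2 = 2K^2
    && \text{so } M^2 \text{ is even, hence } M \text{ is even}
  \end{align*}
  ```
  
  Both N and M turn out to be divisible by 2, which contradicts the assumption that they share no common divisor other than 1.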
Chess
- Given the following board position - how can Black win in 1 move?
  
  Why a Pattern? - The solution to this challenge is pretty easy (you can try to solve it here). I’d call it a Pattern since there are tons of similar chess challenges that will require using the same way of thinking.
Reading Code
- Given the source code of a high-quality logging library with 3K LOC - understand its inner workings
  
  Why a Pattern? - Well, this one is a bit less obvious. Understanding a self-contained code-base such as a logging library entails a couple of things I’d categorize as Patterns. It contains abstractions and mental-models that can be used in other places.
  
  While thoroughly reading such an infrastructure library you should expect to come across things such as:
  - buffering: a common technique is to collect log-entries in-memory before flushing them to disk / sending them via a socket or similar.
  - multi-threading: if the library internals can be accessed simultaneously from multiple threads, we can see how the library protects critical sections and shared resources. Maybe it’ll also have threads in charge of deleting stale data (for example, log entries that have been buffered for too long). Or, flushing data might involve handing it over to some background thread.
  - log-rotations: if we persist data to a file - we may need logic that removes old data. Similarly, if we limit each file to 10MB, the code will have to prune the oldest entries or do something else (like saving new data to a new file).
  - formatting: if the library expects structured-logs, it might have code for formatting structs to strings.
  - parsing: for raw data, we might have parsers that validate it and transform it into some structured shape.
  - compile-time macros: when compiling for production we’d like to omit tracing code.
  - fault-tolerance: a robust logging mechanism should guard against a burst of logging. It may rate-limit the number of calls/sec. If the logging is to a remote machine, it may wrap the calls with a circuit-breaker.
  - tests: they’re always a good place to learn how to use the library’s API and get ideas.
    
    How does the library simulate a logging failure?
    How does the library use fuzzing to detect an edge-case bug?
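
To make a few of the patterns above more tangible, here is a minimal Python sketch that combines three of them: buffering, a lock around shared state, and size-based rotation. It is a toy written for this post, not code from any real logging library. The class name `BufferedLogger`, its parameters, and the use of in-memory lists as stand-ins for files are all assumptions made for illustration.

```python
import threading


class BufferedLogger:
    """Toy logger: buffers entries in memory, flushes in batches, rotates by size."""

    def __init__(self, buffer_size=3, max_file_chars=40):
        self._lock = threading.Lock()   # protects the buffer and the "files"
        self._buffer = []               # entries waiting to be flushed
        self._buffer_size = buffer_size
        self._max_file_chars = max_file_chars
        self._current_chars = 0         # size of the newest "file" so far
        self.files = [[]]               # in-memory stand-ins for log files

    def log(self, message):
        """Buffer a message; flush automatically once the buffer is full."""
        with self._lock:
            self._buffer.append(message)
            if len(self._buffer) >= self._buffer_size:
                self._flush_locked()

    def flush(self):
        """Force buffered entries out, e.g. before shutting down."""
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must already hold self._lock.
        for entry in self._buffer:
            too_big = self._current_chars + len(entry) > self._max_file_chars
            if self.files[-1] and too_big:
                self.files.append([])   # rotate: start a new "file"
                self._current_chars = 0
            self.files[-1].append(entry)
            self._current_chars += len(entry)
        self._buffer.clear()
```

A real library would write to disk or a socket instead of a list, and would likely add the other items from the list (rate-limiting, background flushing threads, and so on). The point is that each of these pieces is small and reappears in many other code-bases.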

Our brain is very good at recognizing patterns. Given an image, it’ll detect objects in a fraction of a second.

Math enthusiasts can receive a Math challenge and at once think of a few solving strategies worth trying, and rule out a couple of others. That’s because they have years of accumulated problem-solving knowledge.

Professional Chess players looking at a game board will instantly infer a lot about it, weigh trade-offs, and have a gut feeling about the best options for the next moves. That’s owing to years of playing and encountering numerous situations.

Experienced developers can start reading a high-quality codebase consisting of a few thousand lines of code and after a short while figure out their way through it. Then, they can deduce which regions lack structure and require more refactoring, suggest architectural changes, and hold a mental model of the code in their head. They can do that since they’ve read and written tons of code.

The more we feed our brains with varied inputs - the larger our toolbox becomes. It means we can solve not only more problems but also solve them more elegantly (and faster), since we’ll have more options to choose from.

How to Train?

So… how to TYONT?
Again, let’s split it into Factual-data and Patterns.

Factual-data Training:

In order to TYONT with facts, you first need to decide what kind of information you want to remember. You may want to extend your vocabulary, learn a new language, remember quotes, and the list goes on and on. My advice is to start with one or two topics at most.
Then you should use the Spaced-repetition technique to optimize what to study and when. It will help you remember more in less time and retain it better.
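
To give a feel for how Spaced-repetition decides "what to study and when", here is a minimal Python sketch of a Leitner-style scheduler. It is a simplified illustration and not the algorithm any particular app uses. The `Card` class, the `review` and `due_cards` functions, and the interval values are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Days to wait before the next review, per box. Illustrative values only.
INTERVALS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    question: str
    answer: str
    box: int = 0            # index into INTERVALS
    due: date = date.min    # a new card is due immediately


def review(card: Card, remembered: bool, today: date) -> None:
    """Promote the card one box on success; send it back to box 0 on failure."""
    if remembered:
        card.box = min(card.box + 1, len(INTERVALS) - 1)
    else:
        card.box = 0
    card.due = today + timedelta(days=INTERVALS[card.box])


def due_cards(cards: List[Card], today: date) -> List[Card]:
    """Return the cards that should be shown today."""
    return [card for card in cards if card.due <= today]
```

With these values, a card answered correctly on its first review comes back two days later, then four, and so on, while a miss at any point sends it back to a one-day interval. Cards you know well show up rarely, and the ones you keep forgetting show up often.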

I think it’s a shame that most people aren’t familiar with Spaced-repetition. It’s so powerful and rewarding. I highly recommend using Anki for applying it. You can think of it as an agent that throws flash-cards with questions at you and asks you to answer them. To make it as effective as possible, you’re encouraged to create your own cards, which helps with remembering the data.

In case you want to expand your vocabulary - SuperMemo is the best Spaced-repetition resource I know of. Not only does it ask you a question - once you ask to review the answer, it will also pronounce it. Additionally, the app offers synonyms for the desired answer - another major advantage of this app.

Spaced-repetition can be super-effective for programmers as well. A classic use-case is making flashcards for keyboard shortcuts or shell commands. Another good use is having flashcards for Syntax. We do have Google, Stack Overflow, and great IDE extensions with auto-complete and such. However, if you find yourself looking up a standard-library function that you’ve already looked up twice over the past week - it might be a good indication to go the extra mile and try saving the next hop to Google.
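
As a small illustration of turning such lookups into cards, the Python sketch below writes question/answer pairs as tab-separated text, a plain format that flashcard apps such as Anki can import. The `CARDS` list and the `to_tsv` helper are hypothetical names made up for this example. The Git answer assumes Git 2.23 or newer and only covers unstaged changes to tracked files.

```python
import csv
import io

# (front, back) pairs; the questions mirror the examples from earlier in the post.
CARDS = [
    ("Chrome: re-open the last closed tab",
     "Ctrl+Shift+T (Cmd+Shift+T on macOS)"),
    ("Git: discard unstaged changes to tracked files (current directory)",
     "git restore ."),
]


def to_tsv(cards):
    """Serialize (front, back) pairs as tab-separated lines, one card per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(cards)
    return out.getvalue()
```

Saving the result of `to_tsv(CARDS)` to a text file gives you a small deck you can grow every time you catch yourself searching for the same thing again.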

I strive to be able to open a text editor with no plugins and just start coding. Don’t get me wrong here - we are paid to solve problems, not to memorize stuff that can be picked up easily. I’m suggesting that we can streamline the common stuff that we do repeatedly. It’s up to each person to pin-point the stuff that seems to slow them down.

Moreover, I’d argue that most developers use a single dominant programming-language at least 80% of the time, along with a handful of keyboard shortcuts (or worse - the mouse). Imagine the ROI boost of being fluent with the syntax and shortcuts you use repeatedly every day. The real benefit won’t be the time-savings - it’ll be the reduction in context-switches. You’ll find yourself wandering much less and consequently staying in focus.

Now that we’ve covered the Spaced-repetition based training - we’re left with the Patterns Training.

Patterns Training

The answer to how to TYONT using Patterns Training is unsurprisingly less obvious. It’s true that we can create flashcards for math riddles in a few cases, or take a Chessboard and turn it into a flashcard. That’ll be feasible, but not always - life isn’t that simple.

For example, let’s return to reading source code. Flashcards won’t assist programmers here (except maybe in learning keyboard-shortcuts for code navigation). The path to getting better at such a skill should be tailored to your own needs. One idea is committing to reading 1000 lines-of-code of a GitHub project daily. That will surely make us somewhat better at reading code.

But we can TYONT better. Let’s instead be more specific. Say that we want to improve our code understanding in a specific domain that interests us. We could look for well-known Open-Source projects on GitHub in that domain.
It’s preferable that the code is written in programming-languages we know. Then we can pick one or two projects and start reading them.

After managing to walk through 1-2 similar projects, the 3rd one will become easier to grasp. It’s important to review code written by different great developers. Programming is a craft and there’s always more than a single valid way to do stuff.

Code projects associated with the same domain will usually have some overlap. It could be shared terminology or similar domain abstractions for representing things.

Sometimes the same problem can be tackled in completely different ways - it’s very educational to see other ways of approaching the same problem. For example, one could solve an Advent of Code programming challenge and then read a variety of solutions.

Beyond that, reading solutions coded in different programming-languages can help develop more mental models (i.e., Patterns). Thinking in more paradigms is a necessity for getting better (at both writing and reading code).

And of course - while reading high-quality code and planting more Pattern seeds inside our brain, we get to learn how to code with style. That means writing more idiomatic code, structuring the code-base well, naming things better, and more.

As a side-note: learning from others is one of the best ways to get better.
Whether it’s going over the solution to a Math challenge in a book, reading other people’s code, or having a real-person mentor you can ask questions - all of these are super effective.

This piece attempts to raise awareness of the TYONT technique.
Its effectiveness doesn’t get the recognition it deserves right now.
The article by no means argues that anyone can be good at math, a professional chess player, a world-class musician, or a top-notch developer. You still need some talent, passion, and persistence, and you need to do many other things to achieve greatness.

In the past, a Math enthusiast could mostly buy only the Math books available in his local store. Maybe in some circumstances, he could get a book from overseas - but how would he even know which books existed? (There was no internet). Today, a Math enthusiast can go to Amazon or similar and order almost any book. If there is online content in a foreign language - he can use a translator.

In the past, if you wanted to be good at Chess you’d need access to good teachers. Today, you can just go to chess.com or a similar website and play for free against millions of people or try solving some chess challenges.

Programmers in the past couldn’t use Open-Source in their work, let alone read code written by people outside their workplace. Today, anyone has access to billions of lines of code.

Thanks to the internet - we have access to a practically unlimited volume of information. One of the byproducts is that learning from others has never been easier.

We can TYONT and accomplish more than we think and as a bonus have fun too.