**November’s Book:** *Computer Science Distilled *by Ferreira Filho

**Why this book:** I do not have a degree in computer science and this looks to be a good primer to get my mind grapes flowing. My college major, Operations & Information Systems Management, did explore some of these concepts but not in depth. I think it is time for a refresh and a deeper dive.

**Final, Final Takeaways:**

- Each chapter ends with a reference list of books to further explore the chapter’s topics.
- The colophon states the cover image is from an 1845 schematic by Charles Babbage; the first programmable computer!

**Notes/Thoughts:**

**Chapter 1: Basics**

- Flowchart
- states & instructions = rectangle
- decision step = triangle

- Factorials get BIG FAST. 10! = 3,628,800
- Fun fact: human DNA has about 3 million base pairs, replicated in each of the 3 trillion cells of the human body
**Final Takeaways**- Great refresher on mathematical terms/concepts:
**truth tables, permutations (with/without identical items), combinations, probabilities: counting and independent, complementary, and mutually exclusive events**. Definitely can use this chapter as a reference! - XKCD is a great comic
- The Zebra Puzzle is fun

- Great refresher on mathematical terms/concepts:

**Chapter 2: Complexity**

- Two main components when considering what an algorithm will “cost”: time and memory (ongoing calculations take up space)
**Exponential algorithms**(2^n) are much more prohibitively expensive than**quadratic algorithms**(n^2)**Final Takeaways**- I want to learn more about
**Big O notation**– I recognize the term & idea, and feel it is the most vital content of the chapter - I have a book in mind for December (Designing Data-Intensive Applications), but Big O would be a good topic for another month. Eventually, I’d like to be a software or systems engineer and obviously knowing more in this realm will be key!

- I want to learn more about

**Chapter 3: Strategy**

**merge algorithm**O(n)- fixed number of operations
- think 2 lists of fish, sorting alphabetically

**power set algorithm**O(2^n)- double the operations when input size increases by 1
- power set = truth table
- think fragrance combinations from a set of flowers

**recursion**- remember the
**base case** - think palindrome or fibonacci function
**recursion vs. iteration**=> it’s a**trade-off**- more expensive: recursion
- faster speed: iteration
- higher complexity: iteration

- remember the
**brute force**(exhaustive search) O(n^2)- number of pairs in an interval increases quadratically as interval increases
- think
**best trade**: in one month, find the day where buying & selling nets the most profit.**OR**find the optimal buy/sell timeframe

**backtracing**- think
**8 Queens Puzzle** - use recursion… once you get a false value, rollback to the last true value and try again
- “fail early, fail often”

- think
**heuristics**- method that leads to a solution that is good enough
- think chess (man vs. computer):
**after your first 4 moves, 288 billion possible positions**, wow! find a move that is good enough. **Greed Approach**- make best choice at each step, and don’t look back
- think burglar filling a knapsack. No time to remove or consider items already in the knapsack

**divide and conquer**- look for optimal substructure and use recursion

**memoization**- think about
**knapsack…**some calculations done repeatedly - store the result of the repeated calc

- think about
**branch and bound**- divide into subproblems
- find upper & lower bounds of each subproblem
- compare subproblem bounds w/ all branches

**Final Takeaway**- will be very helpful to return to this chapter when tackling a data-parsing problem

**Chapter 4: Data**

**Abstract Data Types:**how variables of a given data type are operated**stack**– LI,FO**queue**– FI,FO**list**– more flexible than stack or queue; many available data operations**sorted list**– fewer operators than list, but items*always*ordered**map**– stores mappings with a key and value (kinda like a Hash?)**set**– unordered group of unique items

**Structures:**how variables/data organized & accessed**array**– instant access, sequential memory space, but can be impractical with large data sets**linked list**– each item has a pointer to memory location of next item**double linked list**– each item has pointers in*both*directions- for either linked list…
*can not find specific nth item in list*

- for either linked list…
**array vs. list**=> it’s a**trade-off**- faster insert/delete: list
- insert/delete at random points: list
- random, unordered data access: array
- extreme performance access: array

**tree**- think about traversing HTML
- nodes & edges
**root**node – no parent**leaf**node – no children**height**– level of deepest node

**binary search tree**- at most, each node can have only 2 children
- left node < parent, right node > parent
**binary heap**– tree where parent must be greater (or smaller) than*both*child nodes

**graph**– no child, parent, or root node!**most flexible structure!****hash**– each item is given a memory position; still needs a large chuck of memory set aside

**Final Takeaway**- How is a
**hash**different from a**map**? I know one falls under how data is operated, and the other organized/accessed, but the definitions feel very similar.

- How is a

**Chapter 5: Algorithms**

- Most important, an efficient algorithm likely exists already to solve your issue!
- Lots of
**sorting**algos: Selection, Insertion, Merge, Quick **Searching:**Sequential, Binary, or use a Hash!- Graphs
**:****Depth First Search**vs.**Breath First Search**=> it’s a**trade-off**!- DFS – down, using a Stack
- BFS – across, using a Queue
- Simple, less memory: DFS
- DFS if need to explore all graph nodes
- BFS if expected location is close to the root

- classic problem: find the shortest path between nodes
- try
**Djkistra**Algorithm - uses a
**priority queue**(as opposed to BFS, which is an auxiliary queue) - huge area? try
**Bidirectional Search**

- try
**Google, PageRank**Algorithm- modeled the web as a graph
- web page = node, links = edges
- the more edges a page has, the higher the rank!

- network, workflow, cost issues? Linear optimization problems are likely best solved with
**Simplex Method** **Final Takeaways**- I think the key here is first to see how your data is modeled, and then look for a algorithm that matches the problem at hand
- Beware of choice! Successful algorithms can have pitfalls, drawbacks

**Chapter 6: Databases**

- The definition of
**Normalize, Normalization**is not what I have used/heard- book: The process of transforming a database with replicated data to one without
- familiar: prepare and/or flatten data for ease of consumption
- I think I have been wrong here (the very reason I am writing this blog & reading these books lol)!

- An
**index**is basically a self-balancing binary search tree **NoSQL**(btw, it can be pronounced either way)- most widely known type:
**document store**- a data entry contains all info app needs
- data entry:
**document** - group of documents:
**collection**

- most widely known type:
**Graph Databases**- data entry = node, relationship = edge
*the*most flexible type of DB!- data modeled like a network? Graph is likely best

**Big Data**: Volume, Velocity, Variety, (Variability, Veracity)**SQL**vs.**NoSQL**=> it’s a**trade-off!**- data-centered: SQL
- maximize structure & eliminate duplication: SQL
- application-centered: NoSQL
- faster development: NoSQL
- time spent on consistency: SQL

**Distributed****Single-Master**Replication- master receives all queries
- forwards to slave (which contains a replica of DB)

**Multi-Maste****r**Replication- load balancer distributes queries
- all masters connected
- write queries propagated amongst all masters

**Sharding**- each computer has portion of DB
- query router send query to correct computer
- use with replication to avoid one shard failing and having portion of DB unavailable

- be wary of
**Data Consistency**;**Eventual Consistency**(many writes among distribution take time to catch up & sync) might not be good enough

**Serialization:**SQL, XML, JSON, CSV

**Chapter 7: Computers**

**bus**: group of wires for transmitting data (think individual RAM component)**address bus:**transmit address/location data (unidirectional)**data bus:**transmit data to and from

**register:**internal memory cell in CPU**instruction set:**collection of all operations- at core of never-ending CPU cycle is
**Program Counter**(PC or**BIOS**)- stores the memory address of next instruction to be executed
- special register
- will hold immutable core logic for computer startup

- CPU
**clock**: number of basic operations per second**2 MHz =**two million operations/second- quad-core
**2 GHz**= close to a billion operations/second

**bit architecture**- 4 bit = processing binary number instructions up to 4 digits
- 8 bit = up to 8, 32 bit = 32 digits, etc
- thus a 64-bit program can’t run on 32-bit
- 64-bit register = 2^64 = over 17
*billion*gigabytes

**endian****little**-endian = store numbers**left**-to-right**big**-endian =**right**-to-left

**compiler**- converts programming language into machine instructions
**turing-complete:****scripting languages**(JS, Ruby, Python) use an**interpreter**to skip compiling (much slower! but immediate code execution)- once compiled, original code is impossible to recover
**BUT**is possible to decode the binary (**disassembly**). this is reverse-engineering, and frequently how programs are hacked (think pirated software that bypasses auth/download code)

**memory hierarchy****Processor-Memory Gap:**RAM is slower- the following help bridge the gap
**Temporal Locality**: if used once, likely to be used again**Spatial Locality**: when address used, near-by addresses likely to be used shortly**L1/L2/L3 Cache**: contents of memory with high probability of being accessed

- Main Memory (RAM) – primary
- Main Storage (
**DISK**) – secondary, could be tape or SSD

**Final Takeaway**- This chapter reminded me a lot of a favorite book,
*Code: The Hidden Language of Computer Hardware and Software*by Charles Petzold. A really great read.

- This chapter reminded me a lot of a favorite book,

**Chapter 8: Programming**

**Values**are**first-class**citizens. I never thought of it this way; I always associated the term “first-class” with functions & JavaScript**Paradigms****Imperative**- first!
- exact instructions at each step with specific commands
**Machine Code**: mnemonic instructions (ex –**CP**,**MOV**using**Assembly ASM**)**Structured**:**GOTO, JUMP**to control execution flow, eventually**conditionals**(for much better control)**Procedural**: an advancement, allowing dry code & reusability

**Declarative**- what you want, not how
**Functional**:- functions are first-class, & thus
**higher-order functions** **closures**- can “remember” a var, and access it in future
- allow for clean approach to global var

- functions are first-class, & thus

**Logic**: best for AI, natural language processing

**Final Takeaway**- Once, I failed an interview code challenge. I now know why because I did not understand or know about closures!