10 Pages of Pascal code vs 1 Shell script
-
I'm reading the book Fundamentals of Software Architecture: An Engineering Approach, 1st Edition, by Neal Ford & Mark Richards, and I stumbled upon the following in _Chapter 11. Pipeline Architecture Style_:
Quote:
Donald Knuth was asked to write a program to solve this text handling problem: read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies. He wrote a program consisting of more than 10 pages of Pascal, designing (and documenting) a new algorithm along the way. Then, Doug McIlroy demonstrated a shell script that would easily fit within a Twitter post that solved the problem more simply, elegantly, and understandably (if you understand shell commands): tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed ${1}q
Software Developer: Architecture can cost you money.
Software Architect: It's how I feed my family. :rolleyes:
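For anyone who doesn't read shell fluently, here is the quoted pipeline again with one comment per stage. This is only a sketch: the function name `wordfreq` and the sample sentence are made up for illustration, and the commands are the ones from the quote, wrapped in a function so the count can be passed as `$1`.

```shell
#!/bin/sh
# wordfreq N: print the N most frequent words read from standard input.
# "wordfreq" is a hypothetical name used only for this sketch.
wordfreq() {
    tr -cs 'A-Za-z' '\n' |   # replace each run of non-letters with one newline: one word per line
    tr 'A-Z' 'a-z' |         # fold upper case to lower case
    sort |                   # bring identical words next to each other
    uniq -c |                # collapse each group into "count word"
    sort -rn |               # order numerically by count, highest first
    sed "${1}q"              # print the first N lines, then quit
}

# Sample run on a made-up sentence: lists "the" (3 times) and "and" (2 times).
printf 'the cat and the dog and the bird\n' | wordfreq 2
```

The exact column padding in front of each count depends on the local `uniq` implementation, so treat the layout of the output as unspecified.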
-
It is a funny story, but I'm sure you realize some details got swept under the rug. Besides, from the blurbs on the Amazon page you linked:
Quote:
Everything in software architecture is a trade-off. First Law of Software Architecture
Some trade-offs:
1. The pipeline solution works only in *nix environments, while Knuth's algorithm can probably be implemented on many platforms.
2. tr, uniq, sort and sed were all written by someone; maybe some of their cost should be added to the cost of the pipeline solution. Or, if the problem of finding the most frequent words turns out to be important and frequently used, Knuth's program might become a new utility, dek. The new solution would then be "just use the dek command".
3. The anecdote doesn't say how big the text file is or how fast the solution should run. If you have to find the most frequent words in Encyclopedia Britannica in under 10ms, I doubt the pipeline solution would win the day.
Mircea (see my latest musings at neacsu.net)
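On the size and speed point (3): the quoted pipeline sorts every word occurrence before counting. A possible variant, sketched below, counts words in an awk associative array first, so only the distinct words reach `sort`. The name `wordfreq_hash` is hypothetical, nothing here has been timed, and the result can differ slightly from the original (empty lines are skipped, and words with equal counts may come out in a different order).

```shell
#!/bin/sh
# wordfreq_hash N: hypothetical variant of the quoted pipeline.
# Words are counted in an awk associative array, then only the
# distinct words are sorted by count.
wordfreq_hash() {
    tr -cs 'A-Za-z' '\n' |   # one word per line
    tr 'A-Z' 'a-z' |         # fold to lower case
    awk 'NF { count[$0]++ } END { for (w in count) print count[w], w }' |
    sort -rn |               # highest count first
    sed "${1}q"              # keep the first N lines
}

printf 'the cat and the dog and the bird\n' | wordfreq_hash 2
```

Whether this gets anywhere near a hard time limit on a very large text is an open question; it only moves the counting work out of the first sort.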