All the data you need.

Tag: Computing

How to Get Started as a Data Engineer
If you enjoy working with data, or if you’re just interested in a career with a lot of room for advancement, you might consider becoming a data engineer. But what exactly does a data engineer do, and how can you begin your career in this niche? What Is …
Guide to the recent flurry of posts
I wrote six blog posts this weekend, and they’re all related. Here’s how. Friday evening I wrote a blog post about a strange random number generator based on factorials. The next day my electricity went out, and that led me to think about how I would have written the factorial RNG …
Filling in gaps in a trig table
The previous post shows how you could use linear interpolation to fill in gaps in a table of logarithms. You could do the same for a table of sines and cosines, but there’s a better way. As before, we’ll assume you’re working by hand with just pencil, paper, and a …
Tables and interpolation
The previous post about hand calculations involved finding the logarithm of a large integer using only tables. We wanted to know the log base 10 of 8675310 and all we had was a table of logarithms of integers up to 1350. We used log10 867 = 2.9380190975 and log10 868 = …
Much ado about NaN
I ran across a GitHub repo today that features an amusing hack using the sign bit of NaNs for unintended purposes. This is an example of how IEEE floating point numbers have a lot of leftover space devoted to NaNs and infinities. However, relative to the enormous number of valid …
Planetary code golf
Suppose you’re asked to write a function that takes a number and returns a planet. We’ll number the planets in order from the sun, starting at 1, and for our purposes Pluto is the 9th planet. Here’s an obvious solution: def planet(n): planets = [ "Mercury", "Venus", "Earth", "Mars", "Jupiter", …
Beta inequalities with integer parameters
Suppose X is a beta(a, b) random variable and Y is a beta(c, d) random variable. Define the function g(a, b, c, d) = Prob(X > Y). At one point I spent a lot of time developing accurate and efficient ways to compute g under various circumstances. I did this …
Unix via etymology
There are similarities across Unix tools that I’ve seldom seen explicitly pointed out. For example, the dollar sign $ refers to the end of things in multiple contexts. In regular expressions, it marks the end of a string. In sed, it refers to the last line of a file. In vi …
Functions in bc
The previous post discussed how you would plan an attempt to set a record in computing ζ(3), also known as Apéry’s constant. Specifically that post looked at how to choose your algorithm and how to anticipate the number of terms to use. Now suppose you wanted to actually do the …
Planning a world record calculation
Before carrying out a big calculation, you want to have an idea of how long various approaches would take. This post will illustrate this by planning an attempt to calculate Apéry’s constant to enormous precision. This constant has been computed to many decimal places, in part because it’s an open question …
Parallel versus sequential binding
If someone tells you they want to replace A’s with B’s and B’s with A’s, they are implicitly talking about parallel assignment. They almost certainly don’t mean “Replace all A’s with B’s. Then replace all B’s with A’s.” They expect the name of the Swedish pop group ABBA to be …
Differential Equations and Department Stores
Howard Aiken on the uses of computers, 1956: If it should turn out that the basic logics of a machine designed for the numerical solution of differential equations coincide with the logics of a machine intended to make bills for a department store, I would regard this as the most …
Also a crypto library
The home page for the OpenSSL project says OpenSSL is a robust, commercial-grade, and full-featured toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols. It is also a general-purpose cryptography library. … If you’ve never heard of the project before, you would rightly suppose that the …
Offline documentation
It’s simpler to search the web than to search software-specific documentation. You can just type your query into a search engine and not have to be bothered by the differences in offline documentation systems for different software. But there are a couple of disadvantages. First, the result may not be that …
Finding computer algebra algorithms with computer algebra
I ran across an interesting footnote in Wolfram Koepf’s book Computer Algebra. Gosper’s algorithm [1] was probably the first algorithm which would not have been found without computer algebra. Gosper writes in his paper: “Without the support of MACSYMA and its developer, I could not have collected the experiences necessary …
Collatz analog in C
A few days ago I wrote about an analog of the Collatz conjecture for polynomials with coefficients mod m. When m = 2, the conjecture is true, but when m = 3 the conjecture is false. I wrote some Mathematica code on that post to work with polynomials as polynomials. …
Upper case, lower case, title case
Converting text to all upper case or all lower case is a fairly common task. One way to convert text to upper case would be to use the tr utility to replace the letters a through z with the letters A through Z. For example, $ echo Now is the …
Technological boundary layer
The top sides of your ceiling fan blades are dusty because of a boundary layer effect. When the blades spin, a thin layer of air above the blades moves with the blades. That’s why the fan doesn’t throw off the dust. A mathematical model may have very different behavior in …