Saturday, 3 May, 2014 —
practice
development
podcasts
I have a roughly 20 to 25 minute commute five days a week from our house in North Raleigh to NC State’s Centennial Campus. Most days, I use the drive time to listen to a podcast.
I have two must-listen podcasts every week:
There are two other, more technical, podcasts that I also listen to and cycle in every couple of weeks:
I have about 3 1/3 to 4 hours of otherwise lost time each week driving. I use that time to listen to these podcasts for two related reasons.
First, I do not have a better time to listen to these podcasts, largely because of how my sense of time and focus work. All four of these podcasts demand my full attention. I cannot listen to them successfully while trying to program or read attentively; in those instances I use music. I can, however, focus well on the task of driving and still pay attention to a podcast fairly well.
Second, all four of these podcasts feature smart people thinking out loud about their respective crafts. I, in turn, find myself thinking as a result, and drive time is time when I am not otherwise trying to focus creative attention elsewhere.
Each of these four podcasts is worth a post of its own, which will come in due time. I am, however, capped at these four. There are weeks when I can listen to only one full episode and part of another, depending on episode length. There are weeks when I really need to listen to music for a day or two. While there are many other great podcasts around that I would love to listen to, I stick to these four at most because that is as much as I can stay reasonably caught up with, given how much time I'm willing to spend on it.
This is one of my established daily practices.
Tuesday, 29 April, 2014 —
development
improvement
Back in the fall, Ben Orenstein of thoughtbot was on the Ruby Rogues podcast and talked about sharpening tools: making his Vim profile better, interacting with his desktop better. In some way, he is making his life as a developer better every day.
He said:
I’m a huge believer in the power of habits. I think the things you can manage to make yourself do regularly can have incredible results. And so, a few years ago, I’d say five years ago, I decided to get kind of serious about making my environment really excellent and improving my efficiency that way. And so, I got in this habit of spending the first ten or 15 minutes of my day on tool sharpening. And so, what I started was I started a little text file that I would add to during the day. So, when I was doing something that felt inefficient or felt like whenever I had that inkling, “There must be a better way of doing this,” I’d add it to the list. And then I’d pull one of them off in the morning.
And so, I started most mornings by just doing something simple like making an alias for a command I use in the shell a lot. Or something like, I finally need to research how the Vim expression register works and go do some diving on a readme or something like that. And I thought of it as sort of slowly sanding down the rough edges of my environment. So, anything that kind of like irked me, I would try to spend a little time on every morning. And what I found was not very long of this, I was noticeably faster at the things I needed to do every day. And it was starting to have a huge impact on my productivity. And so, I started talking about that.
Five years of daily tool sharpening or tool making seems like it would lead to some transformative changes in work habits and flow. To that end, I've spent some time recently making some adjustments to my own environments. I'll enumerate some of them here, with a sketch of one of the simpler shortcuts after the list:
- Read up on Homebrew services
- Adjusted my tmuxinator setup
- Added a profile for a Tech Week project
- Added a tmuxinator invocation shortcut
- Installed the Silver Searcher
- Set up ctags in a project as an experiment
- Installed + purchased Dash
- Repaired tools + resources links on my tools page
- Installed MsgFiler
- Created a clipping to insert an Emacs-style counterpart comment in BBEdit source files
- Found a Safari extension that can normalize the size of Safari windows
- Installed TotalTerminal for a terminal window, available via a keyboard shortcut, that I can use for one-off commands
- Updated BBEdit prefs to use Dash for “Find in Reference”
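The tmuxinator invocation shortcut is about as small as these sharpenings get; it amounts to a shell alias. The alias name and project name below are placeholders for illustration rather than my exact setup:

```bash
# A tiny shell alias so a whole project layout comes up with one short command.
# "mux" and "techweek" are illustrative names, not my actual configuration.
alias mux="tmuxinator start"

mux techweek   # brings up the Tech Week project's windows and panes
```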
This isn't an exhaustive list of what I've done since my curiosity was piqued by what Orenstein described. I share it both to inspire someone else to try out the practice and to prompt myself to reestablish the habit.
I plan to revisit this with new entries occasionally.
Sunday, 9 February, 2014 —
engineering
civics
David E. Sanger and Eric Schmitt, reporting for the New York Times, have published an article titled “Snowden Used Low-Cost Tool to Best N.S.A.”. I know they’re reporting for a general audience, but I believe the article does a disservice by allowing anonymous national security “officials” to put simple automation into scare quotes:
Using “web crawler” software designed to search, index and back up a website, Mr. Snowden “scraped data out of our systems” while he went about his day job, according to a senior intelligence official. “We do not believe this was an individual sitting at a machine and downloading this much material in sequence,” the official said. The process, he added, was “quite automated.”
The findings are striking because the N.S.A.’s mission includes protecting the nation’s most sensitive military and intelligence computer systems from cyberattacks, especially the sophisticated attacks that emanate from Russia and China. Mr. Snowden’s “insider attack,” by contrast, was hardly sophisticated and should have been easily detected, investigators found.
Automation gonna automate, I suppose. We have now seen this dance with Aaron Swartz, Chelsea Manning and Edward Snowden: the national security-industrial complex takes a disingenuously naïve view of automation tools, suggesting, particularly around Swartz at MIT and around Snowden, that there was a mix of luck and quite possibly something nefarious behind all this automation. The New York Times should approach statements made by agency officials skeptically. This sort of programming is not hard. Moreover, no one has to work particularly hard to hide it. In fact, what might look to some like "hiding" is simply polite engineering under a different lens.
One key is the not-at-all-advanced concept of throttling. Well-behaved web crawlers (also known as spiders) are respectful about how many requests they issue in a given amount of time. A flood of requests all at once attracts exactly the sort of attention that, as the unnamed officials seem so reluctant to acknowledge, Snowden barely drew to himself.
First, lots of requests in a short amount of time show up in log files as exactly that and quickly become a pattern. Patterns attract attention. Assuming the NSA and its various contractors audit access logs (which is itself something I would automate), spreading requests over time makes the traffic less likely to arouse suspicion. Moreover, unless an audit is looking for a particular type of activity, that manual or automated audit will not care a whit about well-throttled crawler traffic, because it looks a lot like expected traffic. It is "hiding" to the same degree that someone of average height and dress is "hiding" as they walk down a Manhattan sidewalk.
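To make the idea concrete, here is a minimal sketch of throttled crawling with stock tooling. The flags are standard wget options; the URL is a made-up placeholder, not anything from the story:

```bash
# Crawl two levels deep, pausing a randomized interval around 5 seconds
# between requests and capping bandwidth, so the traffic stays inside the
# normal-looking band rather than spiking the logs.
wget --recursive --level=2 \
     --wait=5 --random-wait \
     --limit-rate=50k \
     https://intranet.example/wiki/
```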
Second, setting aside any activity logs, system activity monitors seem more likely to catch a misbehaving web crawler. System activity monitors look at how much work a machine is doing at a given time. Typical checks look at how busy the CPU is, how much RAM is in use, overall network activity, what processes ("programs") are running, and so on. Some servers have automated checks in place; some don't. For the sake of discussion, I will assert that the servers hosting the content Snowden accessed were monitored in such a fashion. Now, assume each server's activity varies but stays within an average band. Unless what Snowden was doing with his web crawler pushed one of these checks out of bounds, it was unlikely to attract attention. Normal activity gets ignored.
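For a sense of what those checks look at, the commands below are the Linux-flavored basics a monitoring script typically wraps. This is purely illustrative, not a claim about what any particular agency runs:

```bash
uptime                        # load average: how busy the CPU has been
free -m                       # how much RAM is in use
ps aux --sort=-%cpu | head    # which processes are working hardest
ss -s                         # summary of open network connections
```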
On to the alleged crawling software itself.
In interviews, officials declined to say which web crawler Mr. Snowden had used, or whether he had written some of the software himself. Officials said it functioned like Googlebot, a widely used web crawler that Google developed to find and index new pages on the web. What officials cannot explain is why the presence of such software in a highly classified system was not an obvious tip-off to unauthorized activity.
First, Snowden's job was as a systems administrator. Systems administration and development jobs involve access to decidedly non-top-secret technologies like *NIX servers, which typically ship with a wide array of scripting languages (Perl and Python most likely, Ruby very possibly). Or perhaps Snowden is a shell scripter; Bash will get the job done.
As software goes, a basic web crawler is not exceptionally hard. I assert that if it's written with tools likely already resident on any average server or *NIX-based laptop (e.g. Mac OS X, Linux, possibly Windows with PowerShell), there's really nothing about one that would raise any particular suspicion. Effectively, the raw pieces of the web crawler were quite likely already present. Writing a text file to marshal those raw pieces together is unlikely to raise suspicion, because a systems administrator or software developer already has scores of similar files lying around. There's no magic "web crawler" bit that flips and alerts anyone.
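To illustrate how little is involved, here is a rough sketch of a crawler built entirely from tools already on hand: curl, grep, cut and a loop. The URLs are placeholders, relative links are not handled, and this is in no way a claim about what Snowden actually wrote:

```bash
#!/usr/bin/env bash
# Fetch a seed page, pull out its links, then fetch each linked document
# with a long pause in between. Everything here ships with a stock *NIX box.
seed="https://wiki.example/index.html"
curl -s "$seed" -o index.html

grep -Eo 'href="[^"]+"' index.html | cut -d'"' -f2 | while read -r url; do
  curl -s -O "$url"   # save each linked document under its own name
  sleep 10            # throttle: one request every ten seconds
done
```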
As a thought experiment, what happens if every machine is audited and new and modified files are flagged, logged and sent off somewhere for analysis? Probably nothing. In a large working group, a lot of these files are going to look very similar to each other and have innocuous or cryptic names, and it would be a nigh-impossible task to write meaningful software that determines what all of these new files are for and, if they're programs, what they do. Surely no one is going to look at each of these files by hand. It would be soul-sucking work.
Put another way: hammers, screwdrivers, wrenches, pliers, saws and knives aren't noteworthy tools in a tool box. A new hammer on a construction site is unlikely to raise any attention. Similarly, just as carpenters use jigs, painters use scaffolding and auto mechanics use impact wrenches, ramps and hydraulic lifts to make their jobs easier, faster, more consistent and less tedious, systems engineers and developers use scripts. Now, imagine a construction site or factory constantly inspecting everyone's tool bag and workspace for anything "inappropriate". It wouldn't be terribly effective, and it would place a huge burden and expense on the actual work. Imagine your average TSA security line at the office park.
There’s also some question about the web crawler having Snowden’s credentials:
When inserted with Mr. Snowden’s passwords, the web crawler became especially powerful. Investigators determined he probably had also made use of the passwords of some colleagues or supervisors.
But he was also aided by a culture within the N.S.A., officials say, that “compartmented” relatively little information. As a result, a 29-year-old computer engineer, working from a World War II-era tunnel in Oahu and then from downtown Honolulu, had access to unencrypted files that dealt with information as varied as the bulk collection of domestic phone numbers and the intercepted communications of Chancellor Angela Merkel of Germany and dozens of other leaders.
Officials say web crawlers are almost never used on the N.S.A.’s internal systems, making it all the more inexplicable that the one used by Mr. Snowden did not set off alarms as it copied intelligence and military documents stored in the N.S.A.’s systems and linked through the agency’s internal equivalent of Wikipedia.
As noted above, there's nothing particularly special about a web crawler versus any other manner of script. It's easy to inform utilities like wget and curl about authentication parameters and to keep login cookies. It's also easy for such a web crawler to announce itself to the server it requests information from in any manner it likes. There's a convention around sending an identification string, as Google and Yahoo do for their web crawlers, but it's just as easy to have a crawler call itself Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko, or Internet Explorer 11. Add in the polite engineering of not requesting every page the crawler sees as soon as it processes each preceding page, and it's far less obvious that the traffic to a web server is coming from a script instead of a human clicking a link. There's not necessarily anything nefarious going on.
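A hedged sketch of what that looks like in practice with curl: log in once, keep the session cookie, and identify as a desktop browser. The host, paths and form field names here are invented for illustration:

```bash
# Log in and store the session cookie (host and form fields are made up).
curl -c cookies.txt -d 'user=jdoe&pass=secret' https://intranet.example/login

# Reuse the cookie on later requests and present a browser user-agent string.
curl -b cookies.txt \
     -A 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko' \
     -O https://intranet.example/wiki/some-page.html
sleep 10   # a polite pause before the next request
```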
If Snowden had access to all of these systems, and if accessing what sounds like the equivalent of a corporate intranet was not going to arouse suspicion, there's little I can think of about this conceptual web crawler that would tip the balance toward his being caught. If the NSA wasn't going to catch Snowden doing all of the work himself, it's no more likely they were going to catch an automated process he wrote.
I don't find any part of this story surprising from a technical standpoint. What I do find somewhat distressing is that unnamed officials think this is special or that it confers villainous status on Snowden. It doesn't, just as it should not have with Aaron Swartz. Said officials should know better, and if they don't, they need to find technical advisors who will inform them correctly.
I bring this all up because I would like reporters on stories such as this to find an average systems administrator, security analyst or software engineer to talk to for perspective. The New York Times has an excellent digital staff with developers who could easily demonstrate what a similar script would look like and how it would work. Surely a news organization that builds great interactive stories and is growing more comfortable in its own clothes online can exercise some agency, draw on some of the experience that is helping to provide that comfort, and call officials on bad, self-serving analysis like this.