Rust Profiling with DTrace and FlameGraph on OSX

This is an alternative to my post Rust Profiling with Instruments and FlameGraph on OSX: CPU/Time, since XCode disabled exporting of the Time/CPU data from Instruments. Many thanks to this post of Brendan Gregg's, which has examples of using DTrace to get the data to feed into his FlameGraph generation scripts, and to GolDDranks on Stack Overflow, who figured out how to get OSX's DTrace to find Rust debugging symbols. I didn't have that problem myself (I'm not sure why), but both inspired me to try DTrace instead of Instruments and write an updated blog post!

So here’s the new workflow that worked for me!

Prerequisites

These are the same.

  • Turn on debug symbols in your release profile in Cargo.toml
    • This lets the profiling tools match up to your source code.
    • As documented on crates.io, this is done by adding this to your Cargo.toml:

      [profile.release]
      debug = true
      
  • Build with cargo build --release then run ./target/release/[binary]; or use cargo run --release
    • Forgetting to build in release mode (which turns on optimizations) is the #1 cause of unexpected Rust slowness.
    • By default, cargo builds in debug mode, which is faster to compile but runs more slowly.
    • We want to profile the optimized code.

DTrace

OSX (yes yes I know it’s macOS now but it’ll always be OSX to me) comes with dtrace! I’m on Sierra, and dtrace -V says dtrace: Sun D 1.13 for me.

TL;DR: the command I ran to profile my program named zopfli was:

sudo dtrace -c './zopfli large-file' -o out.stacks -n 'profile-997 /execname == "zopfli"/ { @[ustack(100)] = count(); }'

Here’s the explanation of all the pieces:

  • sudo: Because of Reasons™, you have to run dtrace as root on OSX.

  • -c: A lot of the tutorials and blog posts I've seen use dtrace by running the program, finding the PID of the running process, and then passing that PID as an argument to dtrace. I don't want to do all that– I want dtrace to run my program and watch just what it's running. Luckily, that's what -c does! I have a bunch of symlinks set up because Other Reasons™, so the command I use to run my Rust program is ./zopfli large-file. That's what the -c './zopfli large-file' part is doing; replace my command with the one you want to run under dtrace.

  • -o: This is the output filename that we'll be passing to the FlameGraph scripts in the next section.

  • -n: This argument specifies the probe names that you want to enable. Breaking this down:

    • profile-997: interrupt the program 997 times per second to sample stacks (an odd rate like 997 Hz helps avoid sampling in lockstep with other periodic activity). See the profile provider in the DTrace book for more info.
    • /execname == "zopfli"/ is a predicate that will fire if the name of the current process's executable file is equal to the string we've specified; you'll want to change this to your program's name. I'm not sure exactly why this is needed, since dtrace should know the executable from the -c argument, but I couldn't figure out the right syntax to get the -n argument to work without it.
    • { @[ustack(100)] = count(); } is the code to run if the predicate is true. This code stores an aggregate (@[]) count (count()) of the number of times that dtrace sees a particular user stack trace (ustack) 100 frames deep.

FlameGraph

Now that you have the output from dtrace in a file, the flamegraph steps are similar. Once you have the FlameGraph code, and assuming you gave -o the argument out.stacks, run:

$ ./stackcollapse.pl out.stacks | ./flamegraph.pl > pretty-graph.svg

Tada!!! You have pretty flame graph SVGs again!!!

Rust Profiling with Instruments and FlameGraph on OSX: CPU/Time

UPDATE APRIL 2017

I just tried to follow my own directions and found out that Instruments no longer lets you export CSV data that the Time Profiler collects (I’m currently using Instruments Version 8.3 (8E162)). I’m leaving the rest of the post as-is in case it’s useful to anyone still on an older version of Instruments, but any point after the export won’t work, so no pretty flame graphs anymore :( There’s still analysis of performance you can do within Instruments.

Original Post

I recently decided I wanted to figure out a way to measure the performance of the Rust code I’ve been writing and see pretty graphs of what parts of the code were slower. There are a bunch of different ways to do this, and I was going to cover more than one of them, but I am le tired so I decided to post this after writing only about Instruments + FlameGraph.

My somewhat-arbitrary constraints used in picking this method of profiling and graphing:

  • Profile Rust programs’ time (I’d like to profile memory usage too, but that’ll be in another post)
  • In a way that produces graphs
  • On OSX (Linux has other, some may say better, options, but for getting a first look at the performance of my code at all, I don’t feel like booting a VM)

If these constraints are the same as yours, great! Read on! However, there are many more tools that didn’t meet these constraints (or that I didn’t find), so I encourage you to do research specific to your situation as well.

Prerequisites

For any profiling of Rust code on OSX, it’s recommended to do the following:

  • Build with cargo build --release then run ./target/release/[binary]; or use cargo run --release
    • Forgetting to build in release mode (which turns on optimizations) is the #1 cause of unexpected Rust slowness.
    • By default, cargo builds in debug mode, which is faster to compile but runs more slowly.
    • We want to profile the optimized code.
  • Turn on debug symbols in your release profile in Cargo.toml
    • This lets the profiling tools match up to your source code.
    • As documented on crates.io, this is done by adding this to your Cargo.toml:

      [profile.release]
      debug = true
      
  • Use OSX’s system allocator instead of jemalloc (a sketch of one way to do this follows)
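    • At the time of writing, this requires nightly Rust; a minimal sketch, assuming the nightly-only alloc_system crate from this era:

      #![feature(alloc_system)]
      extern crate alloc_system;
      // Opting out of jemalloc routes allocations through the system
      // malloc/free, which OSX's profiling tools understand.

      fn main() {
          // ... the rest of your program, unchanged
      }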

Protip on inlining

I had to consult with Jake Goulding on this one– I was profiling some code that had a function calling another function, and only the inner function was showing up. I couldn’t figure out why. Jake quickly looked at what I was doing and guessed correctly that the compiler was inlining the contents of the outer function as an optimization, so it effectively didn’t exist anymore.

Jake taught me about adding the annotation #[inline(never)] before the outer function, which will tell the compiler to not inline it so that I could see the profiling for it.
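For example, a minimal sketch (the function names here are my own, not from the code I was profiling):

#[inline(never)] // keep this function out-of-line so it shows up in profiles
fn outer(input: &[u8]) -> usize {
    inner(input) + 1
}

fn inner(input: &[u8]) -> usize {
    input.len()
}

fn main() {
    println!("{}", outer(b"some bytes"));
}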

Instruments

Instruments is a part of XCode. You’ve already downloaded 2 gigs of XCode, right?

  • Open XCode. Don’t pick anything on the splash screen.
  • From the XCode menu, choose “Open Developer Tool” -> “Instruments”.
  • Choose a blank profiling template. (Or try others! I haven’t explored all the options yet)
  • Click the “+” button in the upper right. Choose “Time Profiler” and drag it to the section in the upper left.
  • In the upper left, after your computer name (yes, my computer’s name is the tableflip emoji), click on where it says “All processes” and click on “Choose Target” in order to choose just your executable instead of all processes. For some reason, populating the list of executables to choose from takes a long time, so don’t be alarmed. [Screenshot: Instruments showing the location of the Time Profiler, where you drop it, and where you choose your executable]
  • If needed, you can type environment variables and arguments into the “Choose Target” window. Here’s an example where I added some arguments: [Screenshot: the Choose Target window with some arguments specified]
  • By default, the Time Profiler takes a sample every 1 ms. Because I was using a toy example, I found that I wasn’t getting interesting results because my code wasn’t being sampled very often, so I changed the sample interval from 1 ms to 40 microseconds in the lower right. [Screenshot: the sampling interval set to 40 microseconds]
  • Click the record button in the upper left and wait for it to run your code. You may need to give administrator privileges to Instruments. You’ll know it’s done when the button changes back to a record button after being a stop button.
  • You do get a little bit of a graph across the top. From the View menu, choosing “Snap Track to Fit” makes it a little easier to see. It still doesn’t show very specific information, though.

Flamegraph

Flamegraph is an awesome collection of perl scripts that can take the output of a few different profiling tools and produce a lovely SVG.

  • To get the data out of Instruments for Flamegraph to use, I followed these instructions which say to:
    • Expand all the nodes in the call tree: Option-click on all the roots to toggle them open
    • From the Instrument menu, choose “Export track” and save the CSV.
  • Download the FlameGraph code and run:

    $ ./stackcollapse-instruments.pl export.csv | ./flamegraph.pl > pretty-graph.svg
    
  • Open the resulting svg in your browser. You now have a pretty graph!!

Here’s an SVG I made from the toy code in this repo.

Happy profiling!!!

Ruby Occurrence Counting

Last night, I was coding, as one does, and I wanted something that I thought FOR SURE Ruby’s Enumerable module already had a method for, because Enumerable is awesome and The Best Module Ever. But alas, it does not. My problem was:

Given an array like:
arr = ["a", "b", "a", "c", "c"]

You want a count of the occurrences of each item in the
array, as a hash with the unique items in the array as
the keys, and the number of occurrences as the values.

The desired result in this example is:
{"a" => 2, "b" => 1, "c" => 2}

I read through all of Enumerable, then tried my hand at a solution, and it just wasn’t pretty or satisfying. This is unusual for Ruby! So what did I do? I asked my coworkers and I asked twitter.

Between all of us, we came up with 12 significantly different solutions:

arr.uniq.map { |x| [x, arr.count(x)] }.to_h
h = Hash.new(0); arr.each { |l| h[l] += 1 }; h
arr.reduce({}) { |m, a| m[a] = m[a].to_i + 1; m }
arr.inject(Hash.new(0)) { |h, i| h[i] += 1; h }
arr.sort.chunk { |ex| ex }.map { |k, v| [k, v.length] }.to_h
arr.reduce({}) { |ret, val| ret[val] = (ret[val] || 0) + 1; ret }
arr.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
arr.each_with_object({}) { |item, memo| memo[item] ||= 0; memo[item] += 1 }
arr.map { |x| { x => 1 } }.inject { |a, b| a.merge(b) { |k, x, y| x + y } }
arr.group_by { |x| x }.map { |element, matches| [ element, matches.length ] }.to_h
Hash[arr.group_by(&:itself).map {|k,v| [k, v.size] }] # Must also upgrade to Ruby 2.2 or Rails 4
arr.sort.chunk(&:itself).map {|v, vs| [v, vs.count]}.to_h # Must also upgrade to Ruby 2.2 or Rails 4

I love the variety! I love that there’s not a clear right answer! I love that the solution is evolving over newer versions of Ruby (that I’m not on yet)!

And as if this wasn’t Tom Sawyerish enough, I decided we then needed to have a vote. As of this writing, with the polls open about 24 hours, I’ve had 41 responses to these 3 questions:

[Image: results of “which do you like best”; winners are arr.sort.chunk(&:itself).map {|v, vs| [v, vs.count]}.to_h with 10 votes (24.4%), then arr.uniq.map { |x| [x, arr.count(x)] }.to_h with 9 votes (22%)]

[Image: results of “which do you think is fastest”; winners are h = Hash.new(0); arr.each { |l| h[l] += 1 }; h and arr.sort.chunk(&:itself).map {|v, vs| [v, vs.count]}.to_h, both with 11 votes (26.8%)]

[Image: results of “which snarky answer did you really want to give”; winner was “What do you mean, you're not on Ruby 2.2 or Rails 4?” with 9 votes (23.1%)]

So then, of course, I had to benchmark the results. And I don’t benchmark things very often, so I don’t know how to do it off the top of my head, but luckily I had Schneems’ Rails pull request open in another tab where someone benchmarked parts of the code using evanphx’s benchmark-ips gem. And a number of things appealed to me in the gem’s readme, namely “No more guessing at random iteration counts!” If there’s anything I know, it’s that I don’t know how to pick iteration counts. The code I used is on GitHub, and I’m sure I messed something up– please send me pull requests if there are improvements to be made!

So I put all the solutions into a Benchmark.ips block, using powers of 10 for the length of the array we’d be counting (this is the n in the Big-O analysis). I used a sampling of lowercase letters in the English alphabet for the values. Benchmarking arrays that are all the same value or all different values instead of a random distribution of values is left as an exercise to the reader.

TL;DR You’re Not Gonna Like The Results

The absolute winner is:

Hash[arr.group_by(&:itself).map {|k,v| [k, v.size] }]

There are two things that I don’t think yinz are going to like about these results:

  1. In my opinion, the differences between the fastest 10 solutions are insignificant. So you’d be fine going with any of those; it’s purely your preference and benchmarking does not give us one unequivocally “best” answer.
  2. The shortest example that 22% of people liked best is significantly slower. Yup, that one looked most elegant to me too! But 7.5x slower isn’t anything to sneeze at.

Here’s the text output of the script I ran. It cuts off because I let it run overnight and it still wasn’t done in the morning, but we did get full runs up to n = 1,000,000.

The last section in each run shows the comparison of the candidates to each other; here’s that section for 1,000,000 items:

Comparison:
     group-by-itself:        6.7 i/s
            group-by:        6.5 i/s - 1.03x slower
           hash-each:        5.1 i/s - 1.29x slower
              inject:        4.6 i/s - 1.44x slower
         reduce-or-0:        4.4 i/s - 1.52x slower
         reduce-to-i:        4.2 i/s - 1.60x slower
  each-with-hash-new:        4.1 i/s - 1.63x slower
               chunk:        4.0 i/s - 1.67x slower
     chunk-by-itself:        3.9 i/s - 1.71x slower
each-with-empty-hash:        3.4 i/s - 1.96x slower
          uniq-count:        0.9 i/s - 7.49x slower
          map-inject:        0.1 i/s - 90.93x slower

You’ll have to look at my code to see which names correspond to each solution.

In order to get the nice, Big-O like graphs that I remember learning all about in college, I took the iterations per second from each N’s comparison section of the results, took the inverse so that I had seconds per iteration, and compiled them in a spreadsheet then made a graph of the data on a logarithmic scale:

[Graph: Big-O-style graph showing that most of the solutions are the same, the uniq-count solution is significantly slower, and the map-inject solution is exponential]

UPDATE 2015-08-08: Gankro kindly mentioned that I didn’t put the y-axis on a logarithmic scale originally; here’s the fixed version, which lets you see more differences between the 10 similarly-performing solutions:

[Graph: the same data as the previous graph, but with the y-axis using a logarithmic scale. There is now some visual difference between some of the 10 fastest solutions.]

So there you go! Go forth and benchmark your own code!

Rustcamp Talk Notes: Navigating the Open Seas

I gave a talk called “Navigating the Open Seas” at Rustcamp today– loosely based on my last blog post :) Here are links to everything I mentioned in the talk in case you’d like to learn more about these parts of Rust’s history!

Rust Discovery, or: How I Figure Things Out

I’m by no means a prolific contributor to the Rust language (yet!), but I have submitted a few patches. One thing that I’m appreciating as I work on these is that it’s possible to figure out why something did (or did not) change since the development process of Rust is happening in the open. Using git, GitHub, the RFCs repo, and documentation, I have felt like I could figure anything out with a little digging. It’s so much more satisfying than languages that are closed source, haven’t documented the decisions made, or have the information more distributed across multiple forums, bug trackers, or mailing lists so that it’s harder to get the full picture.

In programming, I strongly believe that it’s not important to know all the answers off the top of your head, but it is VERY important to know how to figure out the answers you need. You can make meaningful contributions if you have the latter, but most people don’t even try because they feel like they’re supposed to have the former.

I’ve decided to write down my thought processes and the techniques I’ve used while working on a change in order to give some ideas to people who might be feeling overwhelmed or stuck, and to demonstrate that, even though I know very little about how the Rust compiler works, I can help. I’d love to hear about more places/ways I could be looking or about interesting information you’ve found out using these sorts of techniques!

So! Lately, I’ve been working on the grammar documentation, trying to make sure the patterns that explain what is valid in the language are up to date with the code that parses the language since the language is settling down with the 1.0 release impending. The grammar documentation is useful to people writing syntax highlighters, among other things. In this post, I am assuming a general familiarity with EBNF and the variant this Rust document is choosing to use, but I promise they’re essentially a set of rules of substitutions that define what is valid in a language, nothing more. You can totally learn the concepts if you haven’t heard of EBNF before, especially if you’re familiar with regular expressions.

Today, I was working on the Items section and got to the Modules > View items > Use declarations subsection, which at this point in time reads:

use_decl : "pub" ? "use" [ path "as" ident
                          | path_glob ] ;

path_glob : ident [ "::" [ path_glob
                          | '*' ] ] ?
          | '{' path_item [ ',' path_item ] * '}' ;

path_item : ident | "mod" ;

After reading this closely, I noticed what seemed like a few issues:

  1. path isn’t defined anywhere (not even in another section in this document)
  2. `use foo{bar}` (without the `::` between `foo` and `{`) would be valid and I don't think it is

    I was totally wrong that the definition allowed this, see the “IN WHICH I REALIZE I’M TOTALLY WRONG” section. I’m leaving this in rather than correcting it because I think it’s important to demonstrate how research can change your understanding of something.

  3. use foo::{mod} (using the keyword mod) would be valid and I don’t think it is; I don’t remember seeing this in code I’ve read, and git grep -h use | grep mod in the rust repo doesn’t turn up any use declarations that have mod

The second problem seems to be caused by ambiguous grouping. These would be valid if the grouping is:

THIS IS TOTALLY WRONG SEE BELOW

ident [ "::" [ path_glob | '*' ] ] ?
|
'{' path_item [ ',' path_item ] * '}'

But not if the grouping is:

ident
  [ "::" [ path_glob | '*' ] ] ?
  |
  '{' path_item [ ',' path_item ] * '}'

(whitespace altered to make the groups a bit clearer)

So now I want to figure out if I’m correct that these are issues (SPOILER I’M NOT SEE BELOW), and then figure out what these definitions should be.

Are these issues?

In trying to figure out if things I find in code are, in fact, problems, I’ve found two questions to be useful to ask:

  • How did this get to be the way it is today? (history)
  • Does it match the current state of the rest of the code? (present)

Let’s start with the history! Git is really good at keeping track of the history of changes. There are a few tools git provides that could be useful in this case:

  • I could use git log src/doc/grammar.md to read the whole history of this file
  • I could use git blame src/doc/grammar.md to see which commit last changed each line in the file
  • I could use the git “pickaxe” search to look for commits that added or removed occurrences of particular text, such as git log -Gpath_glob

Taking a quick look at the log of the file, and especially with the stats that git log --stat src/doc/grammar.md shows, it looks like a lot of the grammar was added to this file in the first commit to it, ffc5f1c. A GitHub trick I like to use is searching for the commit hash to find the pull request this commit was part of:

[Screenshot: doing a GitHub pull request search for a git SHA returns the pull request that commit was part of]

This is awesome, because in the PR description it says “The current state of the PR contains all the grammars that were available in reference.md and nothing else.” So this content that looks brand new actually has more history as part of reference.md!

So if I do git log src/doc/reference.md, I get… a lot of commits. git log --format=oneline src/doc/reference.md | wc -l => 356 commits as of today, to be exact. I don’t know about you, but I’m not about to look through all of those individually. I did look at the first commit, though, which shows that THIS file actually used to be src/doc/rust.md, and that the section I’m interested in had the same issues at that point.

[Image: a paint program that has opened a screenshot of a screenshot of a screenshot, to infinity, captioned WE NEED TO GO DEEPER]

If I look at the first commit to src/doc/rust.md, the section I’m interested in is different at this point!

use_decl : "pub" ? "use" ident [ '=' path
                          | "::" path_glob ] ;

path_glob : ident [ "::" path_glob ] ?
          | '*'
          | '{' ident [ ',' ident ] * '}'

Taking a search through the whole document at that version, it seems like path still isn’t defined anywhere, so this version still has issue #1. But use foo{bar} would NOT be valid here because use_decl has a requirement of "::" before the path_glob. Also, path_item with "mod" does not appear. So I’ve got a good range to know when some of the changes I’m interested in have been made!

git log --format=oneline src/doc/rust.md | wc -l shows 179 changes, more manageable but I still don’t want to look through all of them manually. At this point, I’m going to pull out git’s pickaxe search. Usually, you’ll see the concept of a pickaxe search being done using git log -Ssomestring, which will show commits where that string was added or removed from a file, but I more often like git log -Gsomestring, which also shows when lines in a file containing that string appeared in a diff, that is, those lines were changed.

If I do git log -Gpath_glob src/doc/rust.md, I get the two commits that moved this content around AND a commit between them that says “Correct EBNF grammar in the manual”, “The grammar for use declarations was outdated.” This looks relevant to my interests! If I do a GitHub search for the pull requests, I can see some discussion from a year ago about WHY these changes are being made. Most interesting to me are the examples that show with these changes, use {many, idents}; would be correctly valid and use *; would be correctly invalid.

IN WHICH I REALIZE I’M TOTALLY WRONG

Up until this point, I thought the current grammar allowed use foo{bar};. Something about seeing the example of the valid use {many, idents}; (a form I hadn’t thought of as existing, much less being valid) made me look at the current grammar rules again and realize that use foo{bar}; is actually NOT allowed. If I were going to aim for making use foo{bar}, I’d have to follow the path of these substitutions, starting from use_decl:

use_decl
"pub" ? "use" [ path "as" ident | path_glob ] // initial expansion
"use" [ path "as" ident | path_glob ] // choose to have 0 "pub"s
"use" path_glob // pick the 2nd arm of the use_decl rule since I don't want "as"
"use" ident [ "::" [ path_glob | '*' ] ] ? // pick the 1st arm of the path_glob rule since I want `foo`

// Important part
"use" ident "::" [ path_glob | '*' ] // choose to have the ? be 1 time, not 0, BAM I have to have `::` in there!!!!

// Continuing with the substitutions to show how it plays out
"use" ident "::" path_glob // pick the 1st arm of the choice I have remaining
"use" ident "::" '{' path_item [ ',' path_item ] * '}' // pick the 2nd arm of the path_glob rule
"use" ident "::" '{' path_item '}' // choose to repeat 0 times
"use" ident "::" '{' ident '}' // pick the 1st arm of the path_item rule
"use" "foo" "::" '{' "bar" '}' // substitute in my idents

I think I was getting confused about where the recursion was happening. Now that I see it this way, it’s hard to see exactly why my brain thought it was a different way (even just 5 minutes ago)! Anyway, I’m not going to spend even a minute feeling bad for my mistake, because it looks like reading and comprehending a grammar definition is hard even for a really smart person :)

SO! I can cross #2 off my list!

Moving right along

Let’s take a look at #3 then. I’m still not so sure about the validity of the mod keyword in this context, but the commit I just found using the pickaxe search while looking into #2 wasn’t where it was introduced. So I’m going to try git log -Gpath_item src/doc/rust.md instead. This leads me to one of the commits that moved this content around AND a new commit whose message is “Add use a::b::{c, mod}; to the manual”. Cool! There isn’t much discussion on the pull request this commit came in with, but looking at the full diff, I see it changed some of the surrounding documentation that now lives in reference.md, namely “Simultaneously binding a list of paths differing only in their final element and their immediate parent module, using the mod keyword, such as use a::b::{mod, c, d};” So at the point in time of this commit, this was indeed valid syntax! But what about now?

At this point, I’m going to switch from looking at history to looking at the present state. This bullet point does exist in the current reference.md, but it now says “Simultaneously binding a list of paths differing only in their final element and their immediate parent module, using the self keyword, such as use a::b::{self, c, d};” Neat! So at some point, the way to do this was changed from using the mod keyword to using the self keyword, and the grammar didn’t keep up with the change!

To be absolutely sure, I’m going to look for that change by using git log -G "and their immediate parent module, using the \`self\` keyword, such as" to see when that line appeared in a diff. That gets me 195fd9a, which says “This should have been done together with 56dcbd1”. But wait. If I look at the diff of 195fd9a, it looks like the change to path_item’s definition that I’m trying to figure out whether I need to make has already been made!!! What?!
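As an aside, here’s what the self form looks like in a use declaration (my own illustrative example, not taken from the reference):

use std::collections::hash_map::{self, HashMap};

fn main() {
    // `HashMap` is bound directly, and `hash_map` (its parent module) is
    // bound too, so module items like `hash_map::RandomState` still resolve.
    let map: HashMap<i32, i32> = HashMap::with_hasher(hash_map::RandomState::new());
    println!("{}", map.len());
}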

At this point, I will admit that I have some knowledge just from having been paying attention to the changes happening in this area for a bit, but I think this is discoverable using the techniques of following changes around that I’ve been demonstrating. When this separate grammar.md file was introduced, it was created from content COPIED from reference.md, but that content wasn’t REMOVED from reference.md, so the two copies evolved separately. Just recently, though, the grammar sections in reference.md were MOVED over to grammar.md, and grammar.md is now linked to from reference.md.

I’ve just discovered that at least one section that already existed in grammar.md wasn’t overwritten when it should have been, since the reference had the more recent content. Now I’m really intrigued, because there’s a broader situation to look into: verifying whether other sections need updates too. I’ve potentially discovered a need for a fix that’s more than one line! Not that there’s anything WRONG with a one-line change, mind you, just that there is an overhead to each commit/pull request for both the submitter and maintainers, and it’s always good when you can find CLASSES of problems and fix them rather than individual INSTANCES of problems.

I’m going to spare you the boring details of checking each section removed from the reference against what existed in the grammar, but my efforts resulted in this commit; I found 5 spots.

And back for one more

Now, you’ll notice I’ve mostly been avoiding problem #1. That’s because it’s the hardest one– if path has never existed in any form, what is it supposed to be?

Before considering that question, I’m going to check the history a little bit more to be sure this wasn’t a missed rename, since I noted that when doc/rust.md was moved to src/doc/rust.md, the document at that version already had this problem. I started by trying git log -Gpath -- doc/rust.md (I’m not sure why git needs the -- in this one but not the previous commands; possibly because doc/rust.md no longer exists in the working tree, so git can’t tell on its own that it’s a path and not a revision), but that yielded a lot of false positives from prose containing “path”. I also tried git log -G" path[^s]" -- doc/rust.md, which got me fewer, but I decided to just start looking at commit messages, git showing the ones that looked potentially related, then using / path in less when looking at the diff.

Then I realized I was being silly– if I’m looking for a potentially-existing-at-some-point definition of path, it would be at the beginning of a line and be followed by a space then a colon (assuming it followed the same conventions as the rest of the file). So that would be git log -G"^path :" -- doc/rust.md. And… nothing. I also tried leaving the filename off; that took a long time to also return nothing. Next I tried git log -Guse_decl -- doc/rust.md, found out a bit about how Rust syntax used to be, but still no clear path definition – everything just referenced the Paths section that only defined expr_path and type_path. I eventually got back to the commit introducing doc/rust.md, which says “Begin shift over to using pandoc, markdown and llnextgen for reference manual”, but doesn’t say shift FROM what. Considering there are no deletions, only additions, of the content I’m interested in, I’m assuming it lived somewhere outside of this repository. Dead end!

Since this post is getting rather long, you’re going to have to tune in next time to see if I just end up adding path : expr_path | type_path ; or if it’s more complicated than that!! Best of luck on your own journeys exploring code :)

TDD Example in Rust

This is an example of how I worked through the Prime Factors Kata using Test Driven Development in Rust (specifically rustc 1.0.0-dev 928e2e239 2015-03-25). I’m assuming you’ve read The Rust Programming Language Book through the Testing chapter, so I won’t be explaining much of the Rust syntax.

The kata, as I decided to interpret it, is:

Write a function that takes an integer greater than 1 and returns a vector of its prime factors, which are the prime numbers that divide that integer exactly, in ascending order. For example, the prime factors of 12 are 2, 2, 3.

If you’d like to try it yourself without spoilers, here is where you should stop reading and pick up again when you’re ready!


If you’d like to see the full code at each step that I describe below, take a look at the commits in this repo. Additionally, the output from cargo test at that commit is in each commit message.

So! I generated a new library using cargo new, then started with a failing test:

#[test]
fn prime_factors_of_two() {
    assert_eq!(prime_factors(2), [2]);
}

And, if I run cargo test, it fails to compile:

$ cargo test
   Compiling prime_factors v0.0.1
(file:///Users/carolnichols/Rust/prime_factors)
src/lib.rs:3:16: 3:29 error: unresolved name `prime_factors`
src/lib.rs:3     assert_eq!(prime_factors(2), [2]);
                            ^~~~~~~~~~~~~
<std macros>:1:1: 9:39 note: in expansion of assert_eq!
src/lib.rs:3:5: 3:39 note: expansion site
error: aborting due to previous error
Build failed, waiting for other jobs to finish...
Could not compile `prime_factors`.

In my mind, TDD in Rust has two “red” states– failing to compile and failing tests. I’m in the first kind right now. I don’t necessarily hit both states every time, nor do they always happen in the same order. I feel like I’ve even stopped distinguishing the two in my mind– it doesn’t really matter whether it’s the compiler or the tests that are helping me make my code better. The next step is the same: read the first problem and do the simplest thing you can do to respond to that problem.

The first message here is “error: unresolved name prime_factors”, and the simplest thing I can do is define that function without any arguments and with the unit type return value:

#[allow(dead_code)]
fn prime_factors() {
}

I also decided to add the attribute to override the dead code lint check since I’m not going to have any non-test code calling this function. Silencing that warning will make the test output clearer. So now I get:

src/lib.rs:7:16: 7:32 error: this function takes 0 parameters but 1
parameter was supplied [E0061]
src/lib.rs:7     assert_eq!(prime_factors(2), [2]);
                            ^~~~~~~~~~~~~~~~

Easy enough, I do indeed want the function to take 1 integer parameter:

fn prime_factors(num: i64) {
}

And now I get a warning and a different error:

src/lib.rs:2:18: 2:21 warning: unused variable: `num`,
src/lib.rs:2 fn prime_factors(num: i64) {
                              ^~~
<std macros>:5:24: 5:35 error: mismatched types:
 expected `()`,
    found `[_; 1]`
(expected (),
    found array of 1 elements) [E0308]
<std macros>:5 if ! ( ( * left_val == * right_val ) && ( * right_val ==
* left_val ) ) {
                                      ^~~~~~~~~~~
<std macros>:1:1: 9:39 note: in expansion of assert_eq!

I’m going to let this warning continue to happen for now because I should be using this variable eventually, but I’m not going to address it this step. I’m more concerned with the first error that says in the test, the things I’m trying to assert are equal aren’t the same type. One, the left side, is the unit type, and the right side is an array of one element.

So I’m going to fix this error first by just specifying the return type:

fn prime_factors(num: i64) -> Vec<i64> {
}

which gets me:

src/lib.rs:1:1: 2:2 error: not all control paths return a value [E0269]
src/lib.rs:1 fn prime_factors(num: i64) -> Vec<i64> {

“Not all”? Try “none”, rustc ;) Anyway, let’s get this compiling and the test passing with the simplest implementation:

fn prime_factors(num: i64) -> Vec<i64> {
    vec![2]
}

And I’m green! Woo! There isn’t anything to refactor here, so onto another failing test:

#[test]
fn prime_factors_of_three() {
    assert_eq!(prime_factors(3), [3]);
}

which compiles, but the test fails:

---- prime_factors_of_three stdout ----
	thread 'prime_factors_of_three' panicked at 'assertion failed:
`(left == right) && (right == left)` (left: `[2]`, right: `[3]`)',
src/lib.rs:13

Yep, as I expected. So… the next simplest thing to get both these tests to pass is returning the argument as a vector, which should work for any prime number:

fn prime_factors(num: i64) -> Vec<i64> {
    vec![num]
}

This compiles, passes the tests, and gets rid of those warnings about not using num. Onward!

#[test]
fn prime_factors_of_four() {
    assert_eq!(prime_factors(4), [2, 2]);
}

Aha, something a bit more interesting since 4 isn’t prime. This fails as expected:

---- prime_factors_of_four stdout ----
	thread 'prime_factors_of_four' panicked at 'assertion failed:
`(left == right) && (right == left)` (left: `[4]`, right: `[2, 2]`)',
src/lib.rs:18

I decided to go with a recursive solution here, but I didn’t really like it at this point. It also took me a while to get the syntax for “return the combination of two vectors as a new vector without any variables” to work right, but it turns out + with a reference to another vec works the way I want:

fn prime_factors(num: i64) -> Vec<i64> {
    if num % 2 == 0 && num > 2 {
        vec![2] + &prime_factors(num / 2)
    } else {
        vec![num]
    }
}

There are a lot of conditionals and a lot of “2”s duplicated around. I could have refactored some of those at this point, but I didn’t. This is where experience and judgement calls come in… I knew this wasn’t the general solution, and I knew I didn’t really like my implementation anyway. Thus, I knew there was a lot of change coming up, and refactoring at this point would potentially do more harm than good and just have to be undone later. But don’t let this become an excuse not to refactor!

If you look at the repo, you’ll see I went ahead and added tests for 5, 6, 7, and 8 that all passed without changing the code. In Uncle Bob’s solution, he skips 5 and 7, ostensibly because he is confident enough in his coverage of prime numbers with tests for 2 and 3 that tests for more primes are no longer interesting. Again, judgement call. It’s also totally ok to write a whole bunch of tests while you’re test driving, but once you understand the domain better, delete some of the tests covering the same cases before finishing, so that you have fewer tests to wait on and fewer that will all break at once if something changes.

And actually, before I ran my test for 6 for the first time, I expected it to fail since 6 isn’t made up of only factors of 2, but my solution was more general than I thought it was!

But 9 is the next interesting test case that fails because it isn’t prime and 2 is not any of its prime factors:

---- prime_factors_of_nine stdout ----
	thread 'prime_factors_of_nine' panicked at 'assertion failed:
`(left == right) && (right == left)` (left: `[9]`, right: `[3, 3]`)',
src/lib.rs:47

So. Because I was feeling a little meh about the recursive solution, I decided to switch to an iterative approach. This means I made 3 things mutable: the argument, a vector to hold the result, and a temporary variable that iterates over potential factors. Now, Rust tries to discourage you from making things mutable by making all variables immutable by default, but that doesn’t mean it’s bad or wrong necessarily. It’s just a bit wordier.

fn prime_factors(mut num: i64) -> Vec<i64> {
    let mut result = vec![];
    while num != 1 {
        let mut i = 2;
        while i <= num {
            if num % i == 0 {
                result.push(i);
                num = num / i;
                break;
            } else {
                i += 1;
            }
        }
    }
    result
}

Now I’m starting to feel like I have a general solution. 9 passes; in the repo you’ll see I added two more tests– one for a larger prime (101 => [101]) and one for a larger non-prime (48 => [2, 2, 2, 2, 3]). Again, these aren’t really necessary since they aren’t covering any cases that aren’t already covered (and they do pass right away). But in this case, they don’t add much time to the test run and they could have caught problems that might not show up with smaller numbers. It all depends.

Refactoring time! I wanted to see if I could get rid of the mutable i variable by switching the while loop to be a for loop over a range, since that’s all it’s doing anyway.

fn prime_factors(mut num: i64) -> Vec<i64> {
    let mut result = vec![];
    while num != 1 {
        for i in 2..num + 1 {
            if num % i == 0 {
                result.push(i);
                num = num / i;
                break;
            }
        }
    }
    result
}

Still passes tests, looks a little better. I still wasn’t happy about it though, since I read in Uncle Bob’s description that he was able to solve the kata in 3 lines of code. I couldn’t spot anything I could change, so at this point I gave up and looked at his solutions in his slides. Now, he did it in Java and his solution was:

for (int candidate = 2; n > 1; candidate++)
    for (; n%candidate == 0; n/=candidate)
        primes.add(candidate);

And now it makes sense. This feels a bit too clever, though, with the for loop conditions not matching the iterator index. It’s neat code golf though. I couldn’t get it down quite this small since Rust doesn’t have this for loop syntax, but this is probably the closest translation and is better than what I ended up with in that it’s not going back out to the outer loop if we can continue to divide by the candidate factor over and over:

fn prime_factors(mut num: i64) -> Vec<i64> {
    let mut result = vec![];
    let mut i = 2;
    while num > 1 {
        while num % i == 0 {
            result.push(i);
            num /= i;
        }
        i += 1;
    }
    result
}

But is this the most idiomatic Rust solution? It didn’t really feel like it, with all those muts still in there. So at this point I decided to give the recursive solution a try again:

fn prime_factors(num: i64) -> Vec<i64> {
    for i in 2..num {
        if num % i == 0 {
            return vec![i] + &prime_factors(num / i);
        }
    }
    vec![num]
}

This feels better to me, after having explored the iterative solutions. What do you think? Again, there’s no one right answer :)

I’ve been meaning to learn how to benchmark Rust and analyze runtime and memory usage, and to write a post about it– perhaps I’ll use these different implementations as material for that :)

Dev URLs without Pow

TL;DR: I had to do a bunch of research in order to set up pretty URLs to apps running on my localhost at a particular port, so I decided to write up the steps I followed in one place in case anyone else wants to do this. Scroll down past the reasons I wanted to do this for the instructions.

I recently decided to use a non-admin user with OSX Yosemite on a daily basis, mostly on the recommendation of a security researcher who discovered a vulnerability in OSX, and because it sounds like a good idea in general.

The only problem I’ve encountered so far is in installing Pow– the script to install it wants sudo, but it also wants to be run as the current user. There are instructions on the wiki for installation as a standard user, but they appear to be out of date; I got errors when trying to use npm run-script. I also got errors when trying to install it from source. So ¯\_(ツ)_/¯.

The thing is, I was only using Pow for its pretty URL creation anyway; I use foreman or grunt to run my apps instead of Pow’s rack server. So I set out to get URLs like http://my-project.dev to serve up a rails app that I would usually access at http://localhost:3000.

My first thought was that I could just use /etc/hosts, but that only works for host names, not host names and ports. Luckily, Apache can do this and, conveniently enough, is included with OSX!

BUT WAIT! I do still need to use /etc/hosts, so that the host goes to localhost where Apache will handle it.

Instructions

So, to spell it all out, I logged in as my admin user (since I’m running from a non-admin day-to-day now) and edited /etc/hosts using sudo:

$ login admin
password:
admin$ sudo vim /etc/hosts

in order to add a new entry for my-project.dev. So now, my /etc/hosts file looks like:

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1	localhost
255.255.255.255	broadcasthost
::1             localhost

127.0.0.1       my-project.dev

At this point, still as the admin user, I started up apache by doing:

admin$ sudo apachectl start

And then I could go to http://my-project.dev and see the Apache “It works!” page. We’re halfway there!

Now, edit /etc/apache2/httpd.conf, find this part about Virtual Hosts and uncomment the Include line so it looks like this:

# Virtual hosts
Include /private/etc/apache2/extra/httpd-vhosts.conf

Then, edit THAT file, /private/etc/apache2/extra/httpd-vhosts.conf, remove the two sample <VirtualHost>s, and insert a new one that looks like this, as recommended by this ServerFault answer:

NameVirtualHost *:80

<VirtualHost *>
    ServerName my-project.dev
    ProxyPreserveHost On

    # setup the proxy
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>
    ProxyPass / http://localhost:3000/
    ProxyPassReverse / http://localhost:3000/
</VirtualHost>

Then restart apache:

admin$ sudo apachectl restart

And, if you’ve got your app running at port 3000, you should be able to visit http://my-project.dev in your browser and get to your app!

Ruby, Rust, and Concurrency

I mostly do Ruby these days, but I’ve just recently started getting into Rust. I’m finding it super interesting! It’s definitely expanding the way I think about programming, and I’m excited about the future it’s foreshadowing.

I’m a visual and experiential learner though, so when I see statements like “data races are compile-time errors” (I originally wrote “race conditions”– thank you for the clarification, dbaupp :)) I wonder what that looks like in the code, what kind of error message you get, and how that differs from what I know in Ruby.

So I went out hunting for a trivial race condition in Ruby that I could port to Rust to see what would happen. Stack overflow to the rescue! Here’s a slightly modified version of the Ruby code in that post that suited my purposes better. Note that this is totally contrived and you would never actually want to have a bunch of threads help you increment a number from 1 to 500:

THREADS = 10
COUNT   = 50

$x = 1

THREADS.times.map { |t|
  Thread.new {
    COUNT.times { |c|
      a  = $x + 1
      sleep 0.000001
      puts "Thread #{t} wrote #{a}"
      $x =  a
    }
  }
}.each(&:join)

if $x != THREADS * COUNT + 1
  puts "Got $x = #{$x}."
  puts "Expected to get #{THREADS * COUNT + 1}."
else
  puts "Did not reproduce the issue."
end

This causes a race condition most of the time that I run it in Ruby 2.0– I’ve got a global variable that the threads read, then they sleep, then they increment, and in the time that one thread sleeps, another thread reads and/or writes that global. Stuff happens all out of order. Here’s parts of the output from one of the times I ran it:

Thread 0 wrote 2
Thread 0 wrote 3
...
Thread 1 wrote 78
Thread 3 wrote 29
...
Thread 3 wrote 59
Thread 4 wrote 29
...
Thread 4 wrote 78
Thread 3 wrote 60
...
Thread 0 wrote 50
Thread 0 wrote 51
Got $x = 51.
Expected to get 501

So what would it look like if I tried to do this in Rust, as faithfully as possible to the contrived example? Here’s my attempt, critiques welcome:

use std::io::timer::sleep;

fn main() {
  let threads = 10;
  let count   = 50;

  let mut x = 1u;
  for num in range(0u, threads) {
    spawn(proc() {
      for num in range(0u, count) {
        let a = x + 1;
        sleep(1);
        x = a; // error: cannot assign to immutable
               // captured outer variable in a proc `x`
      }
    });
  }

  println!("The result is {}", x);
}

Sure enough, if I try to compile this with rust 0.11.0, I get a compile error! Cool! I’ve marked its location and text with a comment. This error says that the compiler captures x and makes it immutable while in the proc. I can write to it all I want in main where it’s declared as mutable, but because I could create race conditions if I was allowed to write to it in a spawned thread, Rust doesn’t let me do that.

I tried to make a version of this program in Rust done The Right Way that would compile and run, but a lot of what I’m seeing is basically saying “don’t do that” to me (and this is a super contrived example). I tried a little bit with mutexes, and a little with futures, but they ended up changing the logic so that one thread locked the value for its whole existence, then the next thread would lock it, etc. I never got it to compile, run, and illustrate everything I wanted it to, so I’m leaving that as an exercise for the reader :) This experiment in fork/join parallelism in Rust looks relevant to my interests, but I haven’t digested it all yet.
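That said, for anyone who wants to see a mutex version work, here’s a minimal sketch in present-day Rust (the 0.11-era APIs shown above are long gone), assuming only std::thread and std::sync::Mutex, and dropping the sleep:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let threads = 10;
    let count = 50;

    let x = Arc::new(Mutex::new(1u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let x = Arc::clone(&x);
            thread::spawn(move || {
                for _ in 0..count {
                    // The read-increment-write happens while holding the
                    // lock, so no other thread can interleave with it.
                    *x.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("The result is {}", *x.lock().unwrap()); // always 501
}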

Hope this helps someone else to understand the different choices that Ruby and Rust make in this area! I’d love to hear what you think, how I could make this example better, or how you’d implement this example to compile and run in Rust :) <3

On Stack Overflow

On Dec 24, 2013, Michael Richter wrote a post about why he no longer contributes to Stack Overflow. I don’t feel the same way as he does, but it made me do a lot of thinking and, eventually, a lot of writing. I left these thoughts as a comment on his post but I decided to make my own post as well. Here they are, slightly modified from their comment form to make a bit more sense standing on their own.

Background: I’ve got a little over 3k in Stack Overflow points currently and I mostly read, ask, and answer questions in the [ruby] and [ruby-on-rails] topic tags.

The reason I think Michael’s post hit me so hard is that in the community organizing stuff I do, I’ve been trying to be very deliberate about making welcoming environments for everyone. So if there are people who don’t feel that Stack Overflow is a welcoming environment, I’m interested in seeing what we can do to change that, and possibly extrapolating the results to my other communities too.

Michael mentions that answers written hastily after a quick google search often get more points than answers that took a long time to write. He also notes that he’s earned thousands of points while being inactive. I agree that SO doesn’t directly incentivize thoughtful answers to complex questions. To me, though, it’s much more about whether I helped someone or not (as Jay Hanlon mentioned in the comments). I enjoy knowing that some content I put on the internet a few years ago is still helping someone today. I would hate to lose that feedback, but I agree it doesn’t necessarily make sense to tie so much information to the one reputation number. Perhaps privileges should be separated more from votes, but I don’t know what would be a better fuzzy metric for “this person knows about these topics and cares about the quality of the content”.

The other useful thing about votes on answers is to indicate which answers are better than others, both from the point of view of the original asker and from the point of view of future searchers with the same question. I wonder if this could be fixed with, instead of showing the raw count of upvotes per answer, showing a percentage of votes a particular answer got out of all the votes cast on the answers for that question. This would be a pretty radical change for SO though…

Michael also noticed that, after he posted a controversial answer, people were downvoting all of his answers. I’m not sure when this was implemented, but Stack Exchange actually has detection for serial voting (both up and down) in place and I’ve seen the system automatically reverse some serial downvoting of my answers within 24 hours.

Michael has been feeling a creeping authoritarianism in the culture of Stack Overflow lately. I agree that the culture of SO has changed, but I don’t remember exactly when, why, or how it happened (my SO activity has waxed and waned a few times). Today, I would probably vote to close some of the questions I asked a few years ago, but as far as I know they were within the site guidelines for good questions at the time. I’m a little confused when putting this section together with the poor pedagogy section, though– many of the situations he cites there of people not attempting to solve their own problems first are handled by moderating them as duplicates, marking them as needing more information about what the asker has tried and why that didn’t work, etc. That would be more authoritarianism, but this section seems like he’d prefer less. I’m just not sure if it’s possible to go back to the way SO used to be now that it’s at this scale.

I definitely think SO could use improvement in the experience of a new user. As pointed out by the commenter Mike and others, not being able to comment unless you’ve gotten 50 points is incredibly frustrating and off-putting, and I’ve seen it cause poor content to be created (answers that start with “I can’t comment on X but…”). Yet another case where reputation is being used for multiple purposes– in this case I think mostly to prevent spam and sock puppets.

It’s also really hard to explain what makes a question high quality for this format and guide a new user towards providing the information that will help them get the information they’re looking for fastest. The current process of closing questions, even with the close reasons given to the asker as feedback, does feel very authoritarian and unwelcoming. This would require much more effort, but perhaps instead of closing, they could be put in a “mentoring” queue that has more affordances for back-and-forth discussion than the main Q&A format. Maybe that’s what chat is for and more questions need to be moved there? I don’t really use chat much.

One final comment I’d like to make is that it’s really tough to paint “the SO community” with one brush, given the number of users and breadth of subtopics. As prolific as he is, I actually rarely see any answers from Jon Skeet because I don’t do any C# and he doesn’t do any Ruby. There are definitely subcultures in SO (although I’d be hard pressed to define them). I suppose my point here is that I’m not convinced that the structure of SO is entirely responsible for the “community” aspect since the same structure has enabled different cultures to exist (although it’s definitely a large factor).

Michael, thank you again for sharing your thoughts– I can tell from your “poor community counter” that this isn’t easy.

And we're back

Sigh. My wordpress blog apparently got hacked, my host shut it down, so I’ve spent the last few days converting this blog to a Jekyll site hosted on github pages. I really can’t recommend wordpress to anyone anymore, given the massive botnets that are attacking every wordpress site out there right now. For non-technical folks or lazy technical people who don’t want to have to spend time keeping up with updates or fighting off attacks, it’s just not a viable solution.

So here we are. I still have a lot of work to do… the links between posts are all broken, and since I had some google juice on at least some of the posts, it’d be nice if links to my old posts still redirected. I haven’t decided whether to add commenting via disqus or similar– I like hearing that my posts have helped people, but do I really need comments? They’re really just another avenue for spam. I’m also not wild about this theme; it’s nice enough, but it’s just not really me (but I also don’t have the time/talent to make something I like better).

Oh, and it’s a good thing I’m not very prolific, because I ended up converting my posts by hand. I had a sql dump file, not a wordpress xml export, and apparently the jekyll-import gem only supports connecting to a database or importing from the wp xml file. And I had enough problems just getting the jekyll import process to tell me even that much that I didn’t have much confidence the import was going to go well anyway.

Technology sucks. Everything’s broken.
