Ruby, Rust, and Concurrency

I mostly do Ruby these days, but I've just recently started getting into Rust. I'm finding it super interesting! It's definitely expanding the way I think about programming, and I'm excited about the future it's foreshadowing.

I'm a visual and experiential learner though, so when I see statements like "data races are compile-time errors" (originally "race conditions"-- thank you for the clarification, dbaupp :)) I wonder what that looks like in the code, what kind of error message you get, and how that differs from what I know in Ruby.

So I went out hunting for a trivial race condition in Ruby that I could port to Rust to see what would happen. Stack overflow to the rescue! Here's a slightly modified version of the Ruby code in that post that suited my purposes better. Note that this is totally contrived and you would never actually want to have a bunch of threads help you increment a number from 1 to 500:

THREADS = 10
COUNT   = 50

$x = 1

THREADS.times.map { |t|
  Thread.new {
    COUNT.times { |c|
      a  = $x + 1
      sleep 0.000001
      puts "Thread #{t} wrote #{a}"
      $x =  a
    }
  }
}.each(&:join)

if $x != THREADS * COUNT + 1
  puts "Got $x = #{$x}."
  puts "Expected to get #{THREADS * COUNT + 1}."
else
  puts "Did not reproduce the issue."
end

This causes a race condition most of the time that I run it in Ruby 2.0-- I've got a global variable that the threads read, then they sleep, then they write back the incremented value, and in the time that one thread sleeps, another thread can read and/or write that global. Stuff happens all out of order. Here's part of the output from one of the times I ran it:

Thread 0 wrote 2
Thread 0 wrote 3
...
Thread 1 wrote 78
Thread 3 wrote 29
...
Thread 3 wrote 59
Thread 4 wrote 29
...
Thread 4 wrote 78
Thread 3 wrote 60
...
Thread 0 wrote 50
Thread 0 wrote 51
Got $x = 51.
Expected to get 501
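
For contrast, here's a minimal sketch of one way to fix this on the Ruby side (my addition, not part of the original Stack Overflow code): nothing stops the threads from sharing $x, but I can opt in to a Mutex so that only one thread at a time does the read-sleep-write.

require 'thread' # Mutex is core in modern rubies; the require is harmless

THREADS = 10
COUNT   = 50

$x    = 1
$lock = Mutex.new

THREADS.times.map { |t|
  Thread.new {
    COUNT.times {
      # The whole read-sleep-write happens inside the lock, so no other
      # thread can sneak in between reading $x and writing it back.
      $lock.synchronize do
        a = $x + 1
        sleep 0.000001
        puts "Thread #{t} wrote #{a}"
        $x = a
      end
    }
  }
}.each(&:join)

puts "Got $x = #{$x}, expected #{THREADS * COUNT + 1}."

This reliably gets to 501, but notice that because the sleep is inside the critical section, the threads mostly end up taking turns-- and Ruby relies on me remembering to add the lock in the first place, which is exactly the difference from Rust that I wanted to see.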

So what would it look like if I tried to do this in Rust, as faithfully as possible to the contrived example? Here's my attempt, critiques welcome:

use std::io::timer::sleep;

fn main() {
  let threads = 10;
  let count   = 50;

  let mut x = 1u;
  for num in range(0u, threads) {
    spawn(proc() {
      for num in range(0u, count) {
        let a = x + 1;
        sleep(1);
        x = a; // error: cannot assign to immutable
               // captured outer variable in a proc `x`
      }
    });
  }

  println!("The result is {}", x);
}

Sure enough, when I try to compile this with Rust 0.11.0, I get a compile error! Cool! I've marked its location and text with a comment. The error says that the proc captures x and makes that capture immutable: I can write to x all I want in main, where it's declared as mutable, but because writing to it from a spawned thread could create data races, Rust doesn't let me do that.

I tried to make a version of this program done The Right Way in Rust that would compile and run, but a lot of what I'm seeing basically says "don't do that" (and this is a super contrived example). I tried a little bit with mutexes, a little with futures, but they ended up changing the logic so that one thread locked the value for its whole existence, then the next thread locked it, and so on. I never got it to compile, run, and illustrate everything I wanted it to, so I'm leaving that as an exercise for the reader :) This experiment in fork/join parallelism in Rust looks relevant to my interests, but I haven't digested it all yet.

Hope this helps someone else to understand the different choices that Ruby and Rust make in this area! I'd love to hear what you think, how I could make this example better, or how you'd implement this example to compile and run in Rust :) <3

On Stack Overflow

On Dec 24, 2013, Michael Richter wrote a post about why he no longer contributes to Stack Overflow. I don't feel the same way as he does, but it made me do a lot of thinking and, eventually, a lot of writing. I left these thoughts as a comment on his post but I decided to make my own post as well. Here they are, slightly modified from their comment form to make a bit more sense standing on their own.

Background: I've got a little over 3k in Stack Overflow points currently and I mostly read, ask, and answer questions in the [ruby] and [ruby-on-rails] topic tags.

The reason I think Michael's post hit me so hard is that in the community organizing stuff I do, I've been trying to be very deliberate about making welcoming environments for everyone. So if there are people who don't feel that Stack Overflow is a welcoming environment, I'm interested in seeing what we can do to change that, and possibly extrapolating the results to my other communities too.

Michael mentions that answers written hastily after a quick google search often get more points than answers that took a long time to write. He also notes that he's earned thousands of points while being inactive. I agree that SO doesn't directly incentivize thoughtful answers to complex questions. To me, though, it's much more about whether I helped someone or not (as Jay Hanlon mentioned in the comments). I enjoy knowing that some content I put on the internet a few years ago is still helping someone today. I would hate to lose that feedback, but I agree it doesn't necessarily make sense to tie so much information to the one reputation number. Perhaps privileges should be separated more from votes, but I don't know what would be a better fuzzy metric for "this person knows about these topics and cares about the quality of the content".

The other useful thing about votes on answers is to indicate which answers are better than others, both from the point of view of the original asker and from the point of view of future searchers with the same question. I wonder if this could be fixed with, instead of showing the raw count of upvotes per answer, showing a percentage of votes a particular answer got out of all the votes cast on the answers for that question. This would be a pretty radical change for SO though...

Michael also noticed that, after he posted a controversial answer, people were downvoting all of his answers. I'm not sure when this was implemented, but Stack Exchange actually has detection for serial voting (both up and down) in place and I've seen the system automatically reverse some serial downvoting of my answers within 24 hours.

Michael has been feeling a creeping authoritarianism in the culture of Stack Overflow lately. I agree that the culture of SO has changed, but I don't remember exactly when, why, or how it happened (my SO activity has waxed and waned a few times). Today, I would probably vote to close some of the questions I asked a few years ago, but as far as I know they were within the site guidelines for good questions at the time. I'm a little confused when putting this section together with the poor pedagogy section, though-- many of the situations he cites there, of people not attempting to solve their own problems first, are handled by moderating them as duplicates, asking for more information about what the asker has tried and why that didn't work, etc. That would be more authoritarianism, but the pedagogy section seems to call for less. I'm just not sure it's possible to go back to the way SO used to be now that it's at this scale.

I definitely think SO could use improvement in the experience of a new user. As pointed out by the commenter Mike and others, not being able to comment unless you've gotten 50 points is incredibly frustrating and off-putting, and I've seen it cause poor content to be created (answers that start with "I can't comment on X but..."). Yet another case where reputation is being used for multiple purposes-- in this case I think mostly to prevent spam and sock puppets.

It's also really hard to explain what makes a question high quality for this format and guide a new user towards providing the information that will help them get the information they're looking for fastest. The current process of closing questions, even with the close reasons given to the asker as feedback, does feel very authoritarian and unwelcoming. This would require much more effort, but perhaps instead of closing, they could be put in a "mentoring" queue that has more affordances for back-and-forth discussion than the main Q&A format. Maybe that's what chat is for and more questions need to be moved there? I don't really use chat much.

One final comment I'd like to make is that it's really tough to paint "the SO community" with one brush, given the number of users and breadth of subtopics. As prolific as he is, I actually rarely see any answers from Jon Skeet because I don't do any C# and he doesn't do any Ruby. There are definitely subcultures in SO (although I'd be hard pressed to define them). I suppose my point here is that I'm not convinced that the structure of SO is entirely responsible for the "community" aspect since the same structure has enabled different cultures to exist (although it's definitely a large factor).

Michael, thank you again for sharing your thoughts-- I can tell from your "poor community counter" that this isn't easy.

And we're back

Sigh. My wordpress blog apparently got hacked and my host shut it down, so I've spent the last few days converting this blog to a Jekyll site hosted on github pages. I really can't recommend wordpress to anyone anymore, given the massive botnets that are attacking every wordpress site out there right now. For non-technical folks, or for lazy technical people who don't want to spend time keeping up with updates or fighting off attacks, it's just not a viable solution.

So here we are. I still have a lot of work to do... the links between posts are all broken, and I had some google juice on at least some of the old posts, so it'd be nice if links to them still redirected. I haven't decided whether to add commenting via disqus or similar-- I like hearing that my posts have helped people, but do I really need comments? They're really just another avenue for spam. I'm also not wild about this theme-- it's nice enough, but it's just not really me (and I don't have the time or talent to make something I like better).

Oh, and it's a good thing I'm not very prolific, because I ended up converting my posts by hand. I had a sql dump file, not a wordpress xml export, and apparently the jekyll-import gem only supports connecting to a database or importing from the wp xml file. I had enough problems just getting jekyll-import to tell me that much that I didn't exactly have much confidence the import process was going to go well anyway.

Technology sucks. Everything's broken.

Wordpress 3.5.1 multisite subdirectory problem

I recently created a new wordpress 3.5.1 install with the intention of enabling multisite. I followed all the official instructions and everything seemed to be working until I got to actually creating a second site in the network with a subdirectory.

I created the site (ex: named 'blah'), but when I tried to go to domain.com/blah or domain.com/blah/wp-admin, it was like I was looking at the main site in the network.

I tried many different recommended .htaccess variations and checked every gotcha mentioned in the official wordpress documentation, on the wordpress stackexchange, and on other wordpress blogs, to no avail.

Finally it took some help from my life pair Jake Goulding to spot that in the site info for the subsite, I had '/blah' for the path setting, and since the main site had '/' as the path, perhaps it should be '/blah/'.

Adding the trailing slash in the path for the site allowed wordpress to recognize it and going to domain.com/blah/wp-admin immediately started working.

I'm admittedly not very experienced with wordpress, so perhaps this is something everyone else was able to figure out on their own-- I didn't see anyone else recommending that you check this.

But I found the help text in the Add a Site form to be misleading:

It says "Only lowercase letters (a-z) and numbers are allowed." To me, that says "You should not enter a trailing slash", when in fact the behavior I see is that you MUST enter a trailing slash. Either the help text should be changed or, even better, this should work whether you enter a trailing slash or not.

Enable SSL with Heroku for https access

For rstat.us, we've been meaning to enable ssl/https for a long time. I'm pleased to announce that we've finally gotten around to it, and everything seems to be working now!

What was the hold up? On heroku (where we started hosting rstat.us, then we weren't, now we are again), it used to cost $100/mo for the ssl hostname addon. I love rstat.us, but not that much. Recently, however, heroku came out with the SSL Endpoint addon which is only $20/mo. This is much more within my budget :)

I also wanted to make sure that the Certificate Authority we went with was in line with rstat.us' values. This is a project that started on the values of openness and simplicity, so I agonized a bit over our choice of CA. We ended up going with a free certificate from StartSSL for the reasons I mentioned in that thread.

Once you have a certificate, the instructions for the SSL Endpoint are pretty good. There are a few places that I either had issues with or think could use some further clarification:

Intermediate certs

In the SSL endpoint docs, there's a part that says "If you have uploaded a certificate that was signed by a root authority but you get the message that it is not trusted, then something is wrong with the certificate. For example, it may be missing intermediary certificates." This doesn't say what you need to do to add the intermediary certificates that you may be missing.

Some certificate authorities will have intermediate certificates that have to be included in your .crt file. The "Purchasing an SSL Certificate" heroku docs explain how to cat your certificate with a root certificate. If you have an intermediate certificate file as well, first cat your certificate as they show in the instructions, then cat the intermediate certificate, then cat the root certificate. I figured this out by trying it and it worked, but I felt unsure since the docs from heroku and the CA didn't explicitly say what order they should go in.

Add vs Edit CNAME

This may be obvious to everyone but me, but I thought I'd mention it anyway in case someone else has the same brain fart I did. The SSL endpoint docs say "Next, add a CNAME record in the DNS configuration that points from the domain name that will host secure traffic e.g. www.mydomain.com to the SSL endpoint hostname, e.g. tokyo-2121.herokussl.com."

So I did exactly that-- added a CNAME record. But we already HAD a CNAME record for www.rstat.us since we had already followed the heroku custom domains instructions. And CNAMES aren't additive, hah. So yeah, if you already have a CNAME for the domain name you're trying to enable ssl with, just edit that one.

Naked domain

This is the part I struggled with for a week. I got SSL all working for www.rstat.us according to the SSL endpoint docs-- https://www.rstat.us worked great! I also had a 301 URL Redirect set up in our DNS records with NameCheap to redirect the naked domain rstat.us to www. This may be unpopular, but naked domains can't be CNAMES and using A records for the naked domain is highly discouraged for availability reasons (see the naked domain section).

However, 301 redirects don't care about the protocol-- if you go to http://rstat.us, it redirects you to http://www.rstat.us, and if you go to https://rstat.us it (supposedly) redirects you to https://www.rstat.us. What I was seeing, though, if you went to https://rstat.us, was that the SSL handshake would just hang. This was very visible when doing a curl -v https://rstat.us on the command line-- it would just get to the SSL handshake and wait forever.

The certificate was good for both rstat.us and www.rstat.us, so that wasn't the problem. As far as I can tell (correct me if I'm wrong), the issue is that the naked domain doesn't point at Heroku at all-- the 301 URL Redirect means it points at the DNS provider's redirect server, and that server only speaks plain HTTP. When you request https://rstat.us, the browser has to complete the SSL handshake with that redirect server before it can even receive the 301 telling it to go to www, and since the redirect server never completes the handshake, the request just hangs.

Heroku's documented solution to this is "only circulating and publicizing the subdomain format of your secure URL." It is left as an exercise for the reader to determine why this is an unacceptable solution.

Not mentioned in that page, but linked in its references, is a blog post by DNSimple introducing ALIAS records. This is something that, as far as I can tell, only DNSimple has implemented, and it basically turns a CNAME into an A record behind the scenes without any need for user intervention should the IP addresses that the CNAME resolves to change.

This StackOverflow question confirmed that the DNSimple ALIAS record would solve the problem I was having-- so I set up a DNSimple account ($3/mo for up to 10 domains at the moment, but use my referral link and we both get a month free!), set up the following records:

ALIAS   rstat.us    www.rstat.us
CNAME   www.rstat.us    toyama-8790.herokussl.com

and changed the settings with NameCheap, where we have rstat.us registered, to use DNSimple's DNS instead of NameCheap's. Indeed, the ALIAS record magically makes all variations of http/https and naked domain/www work as desired!

Can I have my devops merit badge now?

Interesting Times

So today I put in my 2 weeks' notice at Careerimp-- for a variety of reasons that I'm not going to go into publicly, I no longer think the position is the right thing for me at this point in my career.

So what is the right thing? I don't know yet :D I feel very lucky to be in a position where I can afford to take a few months off, and I feel lucky to be fairly certain that I have several options open to me.

In my first month, my main goals are:

  • Put some focused work into rstat.us
  • Work on my house (one of my longest-lasting contributions to this world may very well be making this house able to stand for another hundred years!)
  • Cook new things that use all the fresh vegetables we're getting from a farm share before they go bad
  • Figure out goals for the month after that :D

And of course there's Steel City Ruby! I'll now have the week before the conference to concentrate on making it super awesome, and that I'm definitely excited about :D

Thank you to those of you who have been advising me lately-- you know who you are. <3 You helped make this a lot less scary for me.

Updating gems in rvm's global and default gemsets

This is mostly for my own reference; I know I've looked this up before and it's always hard to find, so at least now I'll have one place to look.

If you want to update gems that you have in your default or global gemsets because you see 2 versions of, say, bundler when you do gem list, you can install and uninstall from them by using a command like this (including the parentheses-- they run the command in a subshell, so your current shell stays on whatever ruby and gemset it was already using):

(rvm use @global; gem uninstall -x bundler)

Switch global and default as necessary, swap uninstall for install, and swap bundler for rake (the 2 gems I usually end up in this situation with).

UPDATE: Just talked with Michal Papis (@mpapis) in IRC and he recommends using:

rvm @global do gem uninstall -x bundler

which looks much nicer. As an aside, I was having an issue yesterday and Michal fixed it by the time I got into work today! Thank you so much for your help and work on rvm, Michal!!

Sinatra::Base rackup: rubyeventmachine.bundle: [BUG] Segmentation fault ruby 1.8.7

I was just saying yesterday that I haven't run into any strange errors lately. I guess I forgot to knock on wood!

Today I started a brand new sinatra app. I haven't written my own sinatra app from scratch before, so I'm copying pieces from some other apps that I have cloned on my machine. Here's what my code looked like:

# Gemfile
source :rubygems

gem 'sinatra'
gem 'thin'

# salmon_test.rb
require 'sinatra'

class SalmonTest < Sinatra::Base
  get "/" do
    "hello"
  end
end

# config.ru
require './salmon_test'
run SalmonTest

Then I bundled and ended up with this Gemfile.lock:

GEM
  remote: http://rubygems.org/
  specs:
    daemons (1.1.8)
    eventmachine (0.12.10)
    rack (1.4.1)
    rack-protection (1.2.0)
      rack
    sinatra (1.3.2)
      rack (~> 1.3, >= 1.3.6)
      rack-protection (~> 1.2)
      tilt (~> 1.3, >= 1.3.3)
    thin (1.3.1)
      daemons (>= 1.0.9)
      eventmachine (>= 0.12.6)
      rack (>= 1.0.0)
    tilt (1.3.3)

PLATFORMS
  ruby

DEPENDENCIES
  sinatra
  thin

Also note that I'm using rvm with ruby 1.9.2-p290 and a brand new gemset for this project.

When I ran rackup to start the server, I got this error message:

$ rackup
~/.rvm/gems/ruby-1.9.2-p290@salmon_test/gems/eventmachine-0.12.10/lib/rubyeventmachine.bundle: [BUG] Segmentation fault
ruby 1.8.7 (2010-01-10 patchlevel 249) [universal-darwin10.0]

Abort trap

Why is it doing something with 1.8.7??? Who knows! Right after that I did an rvm list:

$ rvm list

rvm rubies

   ruby-1.8.7-p358 [ i686 ]
=* ruby-1.9.2-p290 [ x86_64 ]
   ruby-1.9.2-p318 [ x86_64 ]
   ruby-1.9.3-p125 [ x86_64 ]

Yep, using 1.9.2...

The one thing I changed in the code before running rackup again was in salmon_test.rb:

- require 'sinatra'
+ require 'sinatra/base'
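
For reference, that one-line change leaves salmon_test.rb looking like this:

# salmon_test.rb
require 'sinatra/base'

class SalmonTest < Sinatra::Base
  get "/" do
    "hello"
  end
end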

Then the next time I ran rackup everything worked fine. shrug Hope this helps someone else.

Google Chrome AJAX POST request: net_error = -100 (CONNECTION_CLOSED)

I haven't totally figured this out yet. I'm probably doing something dumb.

I have a rails application with an action that I'm hitting via an AJAX POST request. This action generates a PDF and saves it on the server, then sends an email using an ActionMailer, then responds with either a success message or an error message.

Everything was working just hunky-dory except in Google Chrome. In Chrome, at first, it looked like the AJAX request wasn't even being fired, because I wasn't seeing it in the console the way I'm used to seeing requests in Firebug's console. Eventually I found where it was hiding-- in the Net tab :P

There it didn't look like there were any errors except that in the Status column it said "(canceled)". WTF?!? I wasn't canceling it, the rails server said it was handling it and returning the response I was sending just like normal, so what was going on???

By clicking on the request in the Net tab, or by opening chrome://net-internals/ in a new tab and going to the Events tab, I finally noticed this in the response headers:

    Content-Transfer-Encoding: binary
    Content-Type: text/html
    Content-Disposition: inline; filename="pdf_141.pdf"
    Content-Length: 5274

The headers from the PDF generation are bleeding into my AJAX response for some reason!! I'm still not sure why this is happening. I think Chrome then sees that the actual data doesn't match the headers and just stops processing the request.

Once I changed my rails code from:

    render :text => "OK"

to:

    send_data "OK", :type => "text/plain"

aside: yes, i know i fail REST forever

Then the AJAX request happened as I expected it to, and the headers had Content-type: text/plain. sigh.
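
In case the shape of the action is useful context, it ended up looking roughly like this-- the PDF and mailer calls here are made-up stand-ins, not my real code; the only point is the send_data response:

# Rough, hypothetical sketch-- generate_and_save_pdf and OrderMailer are stand-ins
def email_pdf
  path = generate_and_save_pdf(params[:id])   # whatever generates and saves the PDF
  OrderMailer.pdf_email(path).deliver         # whatever sends the email
  send_data "OK", :type => "text/plain"       # keeps the PDF headers out of the AJAX response
rescue => e
  send_data "Error: #{e.message}", :type => "text/plain"
end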

Selenium::WebDriver::Error::UnhandledError (NS_ERROR_ILLEGAL_VALUE)

Somewhere around the time that I got a Mac, upgraded Firefox, upgraded selenium-webdriver, upgraded Capybara and started running lots of acceptance tests in selenium, I started getting errors like this:

Selenium::WebDriver::Error::UnhandledError: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIDOMXPathEvaluator.createNSResolver]

They don't appear to correlate with anything particular that my tests are doing-- I can run the same tests again immediately and have different tests get this error, or have no tests get this error at all. Like all good Heisenbugs, it was especially hard to trigger when I sat down to dig into this!!!

I did the usual googling and the most helpful thing I found was this Capybara mailing list post explaining different ways of debugging Selenium UnhandledErrors.

I didn't see anything initially useful in the output when running with $DEBUG on.

Dumping the firefox console log to disk (this post was clearer to me than the capybara docs on how to do this) showed this error that looked like what I saw:

nsCommandProcessor.js:314 - Exception caught by driver: findElements([object Object])
[Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIDOMXPathEvaluator.createNSResolver]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource://fxdriver/modules/atoms.js :: <TOP_LEVEL> :: line 2354"  data: no]

So then I searched for "firefox nsCommandProcessor.js:314 - Exception caught by driver" and found this selenium issue, which says it's a timing issue in Selenium between get and findElement. I don't know yet whether adding sleeps in my tests would work around it.

And now I've hit a timing issue that I know sleep will cure :D Good night-- I might look into a workaround more tomorrow.

My system:

  • OSX 10.6.8 (no, I haven't upgraded to Lion yet :P)
  • Ruby 1.8.7-p302
  • Firefox 5.0.1
  • selenium-webdriver 2.0.1
  • capybara at 4fc07dbdc814

UPDATE Jul 26: There have been a few updates to the bug today, with some speculation about the cause of the error. One commenter says that they're seeing the problem with post-redirect-gets, but that's not what I'm experiencing-- I'm getting the problem most with regular old links. The same commenter suggests a race condition, so I tried adding a sleep 1 after every line in my script that triggers the error-- after the click_link, before the next statement (which capybara automatically does a wait/find on). This seemed to help at first, but the error just kept happening in different places. I also tried adding a sleep 1 in capybara's click_link, and that did not fix the issue.

BUT! If I add a sleep 1 as the first line of capybara's selenium find in lib/capybara/selenium/driver.rb, that seems to be enough to get around the race condition (knock on wood). I've now run tests that consistently hit the error a few times without hitting it, so...

To be clear: The workaround I've had good results with is changing def find in lib/capybara/selenium/driver.rb to be:

def find(selector)
  sleep 1
  browser.find_elements(:xpath, selector).map { |node| Capybara::Selenium::Node.new(self, node) }
end

Let me know if that works for you. Obviously this is going to make your tests slower, but since you're running Selenium tests already I'm assuming you aren't Gary Bernhardt and can put up with slow but passing tests for a while.
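
If you'd rather not edit the gem in your bundle directly, I think you could apply the same patch by reopening the driver class from your test setup. This is an untested sketch, assuming the require path matches the file I edited above:

# spec_helper.rb (or equivalent)-- untested sketch of the same workaround
require 'capybara'
require 'capybara/selenium/driver'

class Capybara::Selenium::Driver
  def find(selector)
    sleep 1 # same sleep as above, to dodge the Selenium get/findElement race
    browser.find_elements(:xpath, selector).map { |node| Capybara::Selenium::Node.new(self, node) }
  end
end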

And this is a workaround-- the bug is with Selenium, NOT with capybara. The real solution will be a fix to this bug, so if you are hitting this, go star that issue.
