开发者

How much slower are strings containing numbers compared to numbers?

Say I开发者_如何学C want to take a number and return its digits as an array in Ruby.

For this specific purpose or for string functions and number functions in general, which is faster?

These are the algorithms I assume would be most commonly used:

Using Strings: n.to_s.split(//).map {|x| x.to_i}

Using Numbers:

array = []
until n = 0
    m = n % 10
    array.unshift(m)
    n /= 10 
end


The difference seems to be less than one order of magnitude, with the integer-based approach faster for Fixnums. For Bignums, the relative performance starts out more or less even, with the string approach winning out significantly as the number of digits grows.

As strings

Program

#!/usr/bin/env ruby

require 'profile'

$n = 1234567890
10000.times do
    $n.to_s.split(//).map {|x| x.to_i}
end

Output

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 55.64     0.74      0.74    10000     0.07     0.10  Array#map
 21.05     1.02      0.28   100000     0.00     0.00  String#to_i
 10.53     1.16      0.14        1   140.00  1330.00  Integer#times
  7.52     1.26      0.10    10000     0.01     0.01  String#split
  5.26     1.33      0.07    10000     0.01     0.01  Fixnum#to_s
  0.00     1.33      0.00        1     0.00  1330.00  #toplevel

As integers

Program

#!/usr/bin/env ruby

require 'profile'

$n = 1234567890
10000.times do
    array = []
    n = $n
    until n == 0
        m = n%10
        array.unshift(m)
        n /= 10
    end
    array
end

Output

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 70.64     0.77      0.77        1   770.00  1090.00  Integer#times
 29.36     1.09      0.32   100000     0.00     0.00  Array#unshift
  0.00     1.09      0.00        1     0.00  1090.00  #toplevel

Addendum

The pattern seems to hold for smaller numbers also. With $n = 12345, it was around 800ms for the string-based approach and 550ms for the integer-based approach.

When I crossed the boundary into Bignums, say, with $n = 12345678901234567890, I got 2375ms for both approaches. It would appear that the difference evens out nicely, which I would have taken to mean that the internal local powering Bignum is string-like. However, the documentation seems to suggest otherwise.

For academic purposes, I once again doubled the number of digits to $n = 1234567890123456789012345678901234567890. I got around 4450ms for the string approach and 9850ms for the integer approach, a stark reversal that rules out my previous postulate.

Summary

Number of digits | String program | Integer program | Difference
---------------------------------------------------------------------------
5                | 800ms          | 550ms           | Integer wins by 250ms
10               | 1330ms         | 1090ms          | Integer wins by 240ms
20               | 2375ms         | 2375ms          | Tie
40               | 4450ms         | 9850ms          | String wins by 4400ms

How much slower are strings containing numbers compared to numbers?


Steven's response is impressive, but I looked at it for a couple minutes of and couldn't distill it into a simple answer, so here is mine.

For Fixnums

It is fastest to use the digits method I provide below. It's also pretty quick (and much easier) to use num.to_s.each_char.map(&:to_i).


For Bignums

It is fastest to use num.to_s.each_char.map(&:to_i).


The Solution

If speed is honestly the determining factor for what code you use (meaning don't be evil), then this code is the best choice for the job.

class Integer
  def digits
    working_int, digits = self, Array.new
    until working_int.zero?
      digits.unshift working_int % 10
      working_int /= 10
    end
    digits
  end
end

class Bignum
  def digits
    to_s.each_char.map(&:to_i)
  end
end

Here are the approaches I considered to arrive at this conclusion.


I made a solution with 'benchmark' using the code examples of Steven Xu and a String#each_byte-version.

require 'benchmark'
MAX = 10_000

#Solution based on http://stackoverflow.com/questions/6445496/how-much-slower-are-strings-containing-numbers-compared-to-numbers/6447254#6447254
class Integer
  def digits
    working_int, digits = self, Array.new
    until working_int.zero?
      digits.unshift working_int % 10
      working_int /= 10
    end
    digits
  end
end

class Bignum
  def digits
    to_s.each_char.map(&:to_i)
  end
end

[
  12345,
  1234567890,
  12345678901234567890,
  1234567890123456789012345678901234567890,
].each{|num|
puts "========="
puts "Benchmark #{num}"
Benchmark.bm do|b|

   b.report("Integer%        ") do
     MAX.times { 
        array = []
        n = num
        until n == 0
            m = n%10
            array.unshift(m)
            n /= 10
        end
        array     
     }
   end

   b.report("Integer% <<     ") do
     MAX.times { 
        array = []
        n = num
        until n == 0
            m = n%10
            array << m
            n /= 10
        end
        array.reverse
     }
   end

   b.report("Integer#divmod  ") do
     MAX.times { 
        array = []
        n = num
        until n == 0
          n, x = *n.divmod(10)
          array.unshift(x)
        end
        array     
     }
   end

   b.report("Integer#divmod<<") do
     MAX.times { 
        array = []
        n = num
        until n == 0
          n, x = *n.divmod(10)
          array << x
        end
        array.reverse
     }
   end

   b.report("String+split//  ") do
     MAX.times { num.to_s.split(//).map {|x| x.to_i} }
   end

   b.report("String#each_byte") do
     MAX.times { num.to_s.each_byte.map{|x| x.chr } }
   end

   b.report("String#each_char") do
     MAX.times { num.to_s.each_char.map{|x| x.to_i } }
   end

   #http://stackoverflow.com/questions/6445496/how-much-slower-are-strings-containing-numbers-compared-to-numbers/6447254#6447254
   b.report("Num#digit       ") do
     MAX.times { num.to_s.each_char.map{|x| x.to_i } }
   end
 end
}

My results:

Benchmark 12345
      user     system      total        real
Integer%          0.015000   0.000000   0.015000 (  0.015625)
Integer% <<       0.016000   0.000000   0.016000 (  0.015625)
Integer#divmod    0.047000   0.000000   0.047000 (  0.046875)
Integer#divmod<<  0.031000   0.000000   0.031000 (  0.031250)
String+split//    0.109000   0.000000   0.109000 (  0.109375)
String#each_byte  0.047000   0.000000   0.047000 (  0.046875)
String#each_char  0.047000   0.000000   0.047000 (  0.046875)
Num#digit         0.047000   0.000000   0.047000 (  0.046875)
=========
Benchmark 1234567890
      user     system      total        real
Integer%          0.047000   0.000000   0.047000 (  0.046875)
Integer% <<       0.046000   0.000000   0.046000 (  0.046875)
Integer#divmod    0.063000   0.000000   0.063000 (  0.062500)
Integer#divmod<<  0.062000   0.000000   0.062000 (  0.062500)
String+split//    0.188000   0.000000   0.188000 (  0.187500)
String#each_byte  0.063000   0.000000   0.063000 (  0.062500)
String#each_char  0.093000   0.000000   0.093000 (  0.093750)
Num#digit         0.079000   0.000000   0.079000 (  0.078125)
=========
Benchmark 12345678901234567890
      user     system      total        real
Integer%          0.234000   0.000000   0.234000 (  0.234375)
Integer% <<       0.234000   0.000000   0.234000 (  0.234375)
Integer#divmod    0.203000   0.000000   0.203000 (  0.203125)
Integer#divmod<<  0.172000   0.000000   0.172000 (  0.171875)
String+split//    0.266000   0.000000   0.266000 (  0.265625)
String#each_byte  0.125000   0.000000   0.125000 (  0.125000)
String#each_char  0.141000   0.000000   0.141000 (  0.140625)
Num#digit         0.141000   0.000000   0.141000 (  0.140625)
=========
Benchmark 1234567890123456789012345678901234567890
      user     system      total        real
Integer%          0.718000   0.000000   0.718000 (  0.718750)
Integer% <<       0.657000   0.000000   0.657000 (  0.656250)
Integer#divmod    0.562000   0.000000   0.562000 (  0.562500)
Integer#divmod<<  0.485000   0.000000   0.485000 (  0.484375)
String+split//    0.500000   0.000000   0.500000 (  0.500000)
String#each_byte  0.218000   0.000000   0.218000 (  0.218750)
String#each_char  0.282000   0.000000   0.282000 (  0.281250)
Num#digit         0.265000   0.000000   0.265000 (  0.265625)

String#each_byte/each_char is faster the split, for lower numbers the integer version is faster.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜