Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Other Discussions
  3. The Insider News
  4. Useful Unix commands for data science

Useful Unix commands for data science

Scheduled Pinned Locked Moved The Insider News
databasepythonperlrubycom
1 Posts 1 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • T Offline
    T Offline
    Terrence Dorsey
    wrote on last edited by
    #1

    Greg Reda[^]:

    Imagine you have a 4.2GB CSV file. It has over 12 million records and 50 columns. All you need from this file is the sum of all values in one particular column. How would you do it? Writing a script in python/ruby/perl/whatever would probably take a few minutes and then even more time for the script to actually complete. A database and SQL would be fairly quick, but then you'd have load the data, which is kind of a pain. Thankfully, the Unix utilities exist and they're awesome.

    This sounds like a job for VB...

    1 Reply Last reply
    0
    Reply
    • Reply as topic
    Log in to reply
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes


    • Login

    • Don't have an account? Register

    • Login or register to search.
    • First post
      Last post
    0
    • Categories
    • Recent
    • Tags
    • Popular
    • World
    • Users
    • Groups