Should I use pickle or CSV in Python in this situation?

I have several hundred MB of CSV files holding my data, and I often read and plot them with pandas and matplotlib. Before plotting I usually have to preprocess, slice, and otherwise clean the data. Because the figures are interacted with and reported on frequently, I work in a Jupyter notebook with %matplotlib notebook. Should I save the intermediate data derived from the raw data as CSV, so it can be read back directly the next time I need to display it, or should I save it with pickle, on the assumption that reading a pickle is faster for later use?

给我你的怀抱 · 2745 days ago · 844

Replies (2)

  • PHP中文网, 2017-05-18 11:02:47

    CSV is the safe choice. A pickle written under one Python version may fail to load under another, and pickle is not a universal format. For files of a few hundred megabytes, reading CSV is not actually slow. There is also HDF5. CSV and HDF5 are proper data exchange formats.

  • 天蓬老师, 2017-05-18 11:02:47

    CSV is enough. If it turns out not to be fast enough, you can try an HDF5 file.

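
To make the trade-off in the replies concrete, here is a minimal sketch of saving one intermediate DataFrame both ways and reading it back. The DataFrame, its column names (t, value), and the file names are made up for illustration and are not from the question. It only shows how each format round-trips; it does not measure speed, which depends on your data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical intermediate result; columns and values are illustrative only.
df = pd.DataFrame(
    {
        "t": pd.date_range("2017-05-18", periods=4, freq="D"),
        "value": [0.1, 0.2, 0.3, 0.4],
    }
)

tmpdir = tempfile.mkdtemp()
csv_path = os.path.join(tmpdir, "intermediate.csv")
pkl_path = os.path.join(tmpdir, "intermediate.pkl")

# CSV: plain text that any tool can read, but column types are
# re-inferred on every read, so dates come back as strings by default.
df.to_csv(csv_path, index=False)
from_csv_raw = pd.read_csv(csv_path)
from_csv = pd.read_csv(csv_path, parse_dates=["t"])  # restore the dates explicitly

# pickle: stores the DataFrame with its dtypes intact, but the file is
# Python-specific and may not load under a different Python/pandas version.
df.to_pickle(pkl_path)
from_pkl = pd.read_pickle(pkl_path)

print("CSV, default read:   ", from_csv_raw["t"].dtype)
print("CSV with parse_dates:", from_csv["t"].dtype)
print("pickle equals source:", from_pkl.equals(df))
```

If you do pick pickle for speed, one cautious pattern is to keep the original CSV files as the source of truth and treat the pickles as a disposable cache that can be regenerated after an upgrade.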