2nd Week – Buffl

by Luca I.

Reading a CSV file with pandas

pd.read_csv()

or

pd.read_table('filepath', sep=',')
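The two calls above can be compared in a minimal sketch. The in-memory text and the column names `a`, `b`, `c` are made-up stand-ins for a real file path, used only so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# read_csv uses ',' as its default separator.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to a tab separator, so ',' has to be passed explicitly.
df_table = pd.read_table(io.StringIO(csv_text), sep=",")

# Both calls should produce the same DataFrame.
print(df_csv.equals(df_table))
```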

Magic Commands

!ls -> print all contents of the current working directory
!pwd -> print the current working directory
!cat filename (on Windows: !type filename) -> print the content of the file
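The `!` prefix only works in IPython/Jupyter. In a plain Python script, roughly the same information can be obtained with the standard library, as in this sketch. The file `sample.txt` is a hypothetical file created just for the demonstration.

```python
import os
import tempfile
from pathlib import Path

# Roughly what !pwd shows: the current working directory.
cwd = os.getcwd()

# Roughly what !ls shows: the entries in the current working directory.
entries = os.listdir(cwd)

# Roughly what !cat filename shows: the content of a file.
# A temporary, hypothetical file is created so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello\n")
    content = sample.read_text()

print(cwd)
print(entries)
print(content)
```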

Absolute vs. relative path

absolute path -> full path starting at the root directory, including the filename
relative path -> path to the file starting from the current working directory
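The difference can be illustrated with `pathlib`. This is only a sketch: `data/ex1.csv` is a hypothetical location and does not need to exist for the example to run.

```python
from pathlib import Path

# Hypothetical relative path: interpreted from the current working directory.
relative = Path("data") / "ex1.csv"

# Prefixing the current working directory yields the absolute path.
absolute = Path.cwd() / relative

print(relative.is_absolute())
print(absolute.is_absolute())
print(absolute)
```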

Reading csv files

assign column/header names
don't use a header row
assign column names and an index column
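One way to cover the three cases above is with the `header`, `names` and `index_col` arguments of `pd.read_csv()`. The data and the column names in this sketch are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical file content without a header row.
raw = "1,2,3,4,hello\n5,6,7,8,world\n9,10,11,12,foo\n"

# No header row: pandas numbers the columns 0, 1, 2, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)

# Assign column names explicitly.
names = ["a", "b", "c", "d", "message"]
named = pd.read_csv(io.StringIO(raw), names=names)

# Assign column names and use one of the columns as the index.
indexed = pd.read_csv(io.StringIO(raw), names=names, index_col="message")

print(no_header)
print(named)
print(indexed)
```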

Irregular data files

use whitespace as delimiter
use tab as delimiter
skip rows that are not useful

whitespace -> sep='\s+'
tab -> sep='\t'

skiprows=[0, 2, 3]
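The three options can be tried out on small in-memory samples. The sample texts below are made up; only `sep` and `skiprows` come from the notes.

```python
import io

import pandas as pd

# Columns separated by a varying amount of whitespace.
ws_text = "A    B   C\n1  2     3\n4   5  6\n"
ws_df = pd.read_csv(io.StringIO(ws_text), sep=r"\s+")

# Columns separated by tabs.
tab_text = "A\tB\tC\n1\t2\t3\n"
tab_df = pd.read_csv(io.StringIO(tab_text), sep="\t")

# Lines 0, 2 and 3 (counted from 0) are comments and get skipped.
messy = "# comment\na,b,c\n# another\n# more\n1,2,3\n4,5,6\n"
clean_df = pd.read_csv(io.StringIO(messy), skiprows=[0, 2, 3])

print(ws_df)
print(tab_df)
print(clean_df)
```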

Handling missing values

replace the 'two' entry in the something column with NaN and replace 'foo' with NaN in the message column

na_values argument of pd.read_csv()
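A possible way to do this is to pass a dict to `na_values`, mapping each column to the values that should count as missing in that column. The sample rows and the extra columns `a` and `b` are assumptions made for the sketch.

```python
import io

import pandas as pd

# Hypothetical data with a 'something' and a 'message' column.
raw = (
    "something,a,b,message\n"
    "one,1,2,hello\n"
    "two,5,6,world\n"
    "three,9,10,foo\n"
)

# Per-column sentinels: 'two' is missing only in 'something',
# 'foo' is missing only in 'message'.
sentinels = {"something": ["two"], "message": ["foo"]}
df = pd.read_csv(io.StringIO(raw), na_values=sentinels)

print(df)
```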

Writing data to text format

DataFrame -> CSV
DataFrame -> CSV with '|' as separator
replace null values with NULL
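These three tasks map onto `DataFrame.to_csv()` and its `sep` and `na_rep` arguments. In this sketch the DataFrame is invented, and `to_csv()` is called without a path so that it returns the CSV text instead of writing a file.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame containing missing values.
df = pd.DataFrame(
    {"a": [1, 5], "b": [2.0, np.nan], "message": ["hello", None]}
)

# Default: comma-separated, missing values written as empty strings.
plain = df.to_csv()

# Use '|' as the separator.
piped = df.to_csv(sep="|")

# Write missing values as the string NULL.
with_null = df.to_csv(na_rep="NULL")

print(plain)
print(piped)
print(with_null)
```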

Writing data to text format

without index and column labels
only keep columns a, b, c

With no other options specified, both the row and column labels are written, but they can be disabled by setting index and header to False.
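As a sketch of these options: `index=False` and `header=False` drop the row and column labels, and `columns` restricts the output to a subset of columns. The DataFrame is made up for the example.

```python
import pandas as pd

# Hypothetical DataFrame with four columns.
df = pd.DataFrame({"a": [1, 5], "b": [2, 6], "c": [3, 7], "d": [4, 8]})

# Write neither the row labels (index) nor the column labels (header).
bare = df.to_csv(index=False, header=False)

# Keep only columns a, b and c (row labels are dropped here as well).
subset = df.to_csv(index=False, columns=["a", "b", "c"])

print(bare)
print(subset)
```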

Pandas and Excel

read data from Excel
store the data as CSV

data = pd.read_excel('ex1.xlsx', sheet_name='Sheetname')
data.to_csv()
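The two steps could be wrapped in a small helper like the one below. The function name `excel_sheet_to_csv` and its parameters are hypothetical, and `index=False` is a choice made for this sketch. Reading `.xlsx` files with `pd.read_excel()` also requires an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def excel_sheet_to_csv(xlsx_path, sheet_name, csv_path):
    """Read one sheet of an Excel file and store it as a CSV file.

    Hypothetical helper for illustration; returns the DataFrame that was read.
    """
    # Read the requested sheet into a DataFrame.
    data = pd.read_excel(xlsx_path, sheet_name=sheet_name)
    # Write it as CSV without the row labels.
    data.to_csv(csv_path, index=False)
    return data
```

A call such as `excel_sheet_to_csv('ex1.xlsx', 'Sheetname', 'ex1.csv')` would then read the sheet and write the CSV file, where `'ex1.csv'` is an assumed output name.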

Information

Last modified
3 months ago

© 2023 Buffl GmbH