Reading a CSV file with pandas
pd.read_csv()
or
pd.read_table('filepath', sep=',')
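A minimal sketch of the two calls above, assuming a small inline CSV string in place of a real file (the sample data and variable names are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# read_csv uses ',' as its default separator.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to '\t', so the comma has to be passed explicitly.
df_table = pd.read_table(io.StringIO(csv_text), sep=",")

# Both calls should produce the same DataFrame.
print(df_csv.equals(df_table))
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.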
Magic Commands
!ls -> print all content in the current working directory
!pwd -> print current working directory
!cat filename (on Windows: !type filename) -> print out the content of the file
Absolute vs. relative path
absolute path -> full path from the root directory down to the filename
relative path -> path to the file starting from the current working directory
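The difference can be sketched with Python's standard pathlib module. The file name ex1.csv is only a placeholder here, and the file does not need to exist for this to run:

```python
from pathlib import Path

# Hypothetical file name used only for illustration.
relative = Path("ex1.csv")

# A relative path is interpreted against the current working directory,
# so joining the two gives the absolute path.
absolute = Path.cwd() / relative

print(relative.is_absolute())
print(absolute.is_absolute())
print(absolute)
```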
Reading CSV files
assign column/header names (names argument)
don't use headers (header=None)
assign column names and index column (names and index_col arguments)
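A short sketch of the three cases, assuming a headerless file with four columns (the data and the column names a, b, c, message are chosen for illustration):

```python
import io
import pandas as pd

# Hypothetical file content without a header row.
csv_text = "1,2,3,hello\n4,5,6,world\n"
names = ["a", "b", "c", "message"]

# Assign column names explicitly.
df_named = pd.read_csv(io.StringIO(csv_text), names=names)

# Do not treat the first row as a header; columns are numbered 0, 1, 2, ...
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# Assign column names and use the 'message' column as the index.
df_indexed = pd.read_csv(io.StringIO(csv_text), names=names, index_col="message")

print(df_named)
print(df_no_header)
print(df_indexed)
```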
Irregular data files
use whitespace as delimiter
use tab as delimiter
skip rows that are not useful
whitespace: sep = '\s+'
tab: sep = '\t'
skiprows = [0,2,3]
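These three options might be used as follows. This is a sketch with made-up inline data; the comment lines in the last sample are an assumption about what "not useful" rows could look like:

```python
import io
import pandas as pd

# Columns separated by a varying amount of whitespace.
ws_text = "a   b  c\n1  2   3\n4   5 6\n"
df_ws = pd.read_csv(io.StringIO(ws_text), sep=r"\s+")

# Columns separated by tabs.
tab_text = "a\tb\tc\n1\t2\t3\n"
df_tab = pd.read_csv(io.StringIO(tab_text), sep="\t")

# Lines 0, 2 and 3 are comments; skiprows drops them before parsing.
messy_text = (
    "# a comment line\n"
    "a,b,c\n"
    "# another comment\n"
    "# and one more\n"
    "1,2,3\n"
    "4,5,6\n"
)
df_skip = pd.read_csv(io.StringIO(messy_text), skiprows=[0, 2, 3])

print(df_ws)
print(df_tab)
print(df_skip)
```

Note that skiprows counts lines of the file starting at 0, before the header is read.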
Handling missing values
replace the 'two' entry in the something column with NaN & replace 'foo' with NaN in the message column
na_values argument in read_csv
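A sketch of this, assuming a small table with a something and a message column (the rows are invented for illustration). Passing a dict to na_values applies each list of sentinel values only to the named column:

```python
import io
import pandas as pd

# Hypothetical data containing the sentinel values 'two' and 'foo'.
csv_text = (
    "something,a,b,message\n"
    "one,1,2,hello\n"
    "two,5,6,world\n"
    "three,9,10,foo\n"
)

# Map each column name to the values that should be read as NaN.
sentinels = {"something": ["two"], "message": ["foo"]}
df = pd.read_csv(io.StringIO(csv_text), na_values=sentinels)

print(df)
print(df.isna())
```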
Writing data to text format
dataframe -> csv
dataframe -> csv with '|' as separator
replace null values with NULL
without index and column labels
only keep columns a, b, c
With no other options specified, both the row and
column labels are written, but they can be disabled by
setting index=False and header=False
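The options listed above could look like this in practice. This is a sketch with an invented DataFrame; calling to_csv without a path returns the CSV text as a string, which is handy for inspecting the result:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column c.
data = pd.DataFrame(
    {
        "a": [1, 5],
        "b": [2, 6],
        "c": [3.0, np.nan],
        "message": ["hello", "world"],
    }
)

csv_default = data.to_csv()                               # row and column labels included
csv_pipe = data.to_csv(sep="|")                           # '|' as separator
csv_null = data.to_csv(na_rep="NULL")                     # missing values written as NULL
csv_bare = data.to_csv(index=False, header=False)         # no row or column labels
csv_abc = data.to_csv(index=False, columns=["a", "b", "c"])  # only columns a, b, c

print(csv_pipe)
print(csv_null)
print(csv_bare)
print(csv_abc)
```

To write to disk instead, a file path would be passed as the first argument, for example data.to_csv('filepath').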
Pandas and Excel
read data from excel
store data as csv
data = pd.read_excel('ex1.xlsx', sheet_name='Sheetname')
data.to_csv('filepath')