Reading CSV files in Python is commonly done with the pandas library or the built-in csv module. 1. pandas reads a file with pd.read_csv(), returns a DataFrame, and supports parameters such as sep, header, index_col, encoding, and na_values, making it well suited to data analysis; 2. the csv module reads line by line via csv.reader or csv.DictReader, the former yielding each row as a list and the latter as a dictionary, which suits lightweight scripts or cases with no third-party dependencies; 3. frequently asked questions: use a full path to avoid file-path errors, set encoding='gbk' or 'utf-8' to fix garbled Chinese text, and use the skiprows parameter to skip specific rows. In short, pandas is recommended for everyday analysis, while the csv module is a good fit for simple scripts; both approaches are complete and flexible.
Reading CSV files is a very common operation in Python. It is usually done with the pandas library, or with the csv module from the standard library. Below are a few practical examples for different scenarios.

1. Read CSV using pandas (recommended)
import pandas as pd

# Read the CSV file
df = pd.read_csv('data.csv')

# Show the first few rows of data
print(df.head())
Notes:
- pd.read_csv() is the most commonly used method.
- It supports automatic parsing of column names, handling of missing values, specifying data types, and more.
- It returns a DataFrame that can be used directly for subsequent data analysis.
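For example, to control column types and missing-value handling explicitly, here is a minimal sketch (the column names 'id' and 'price' are hypothetical, chosen only for illustration):

import pandas as pd

# Hypothetical columns: force 'id' to string and 'price' to float,
# and treat 'NULL' and empty strings as missing values
df = pd.read_csv(
    'data.csv',
    dtype={'id': str, 'price': float},
    na_values=['NULL', ''],
)
print(df.dtypes)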
Common parameters:

- sep=',' : specify the delimiter (comma by default).
- header=0 : use the first row as the column names.
- index_col=None : do not use any column as the index; a specific column can also be set as the index.
- encoding='utf-8' : specify the encoding, commonly needed when the file contains Chinese text.
- na_values=['N/A', ''] : customize which values are treated as missing.
Example:
df = pd.read_csv('data.csv', encoding='utf-8', na_values='NULL')
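A slightly fuller call combining several of these parameters might look like the following sketch (the semicolon delimiter and the 'date' index column are assumptions made for illustration):

import pandas as pd

# Assumed layout: semicolon-separated, first row holds the column names,
# and a 'date' column that we want to use as the index
df = pd.read_csv(
    'data.csv',
    sep=';',
    header=0,
    index_col='date',
    encoding='utf-8',
    na_values=['N/A', ''],
)
print(df.head())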
2. Read CSV using the csv module (standard library)
If you don't want to depend on third-party libraries, you can use Python's built-in csv module.

import csv

with open('data.csv', mode='r', encoding='utf-8') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is a list
If the CSV has a header row, you can use DictReader:
import csv

with open('data.csv', mode='r', encoding='utf-8') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)  # Each row is a dictionary keyed by column name
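Because each row is a dictionary, individual fields can be read by column name. A minimal sketch, assuming the file has a column called 'name' (a hypothetical header used only for illustration):

import csv

with open('data.csv', mode='r', encoding='utf-8') as file:
    reader = csv.DictReader(file)
    for row in reader:
        # 'name' is a hypothetical column; use whatever header your file actually has
        print(row['name'])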
3. Frequently asked questions
File path error?
Make sure the file is in the current working directory, or use the full path:

df = pd.read_csv(r'C:\path\to\your\data.csv')
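If you're not sure which directory the script is running from, printing the working directory and checking whether the file is visible from there can quickly pin down the problem; a small sketch using only the standard library:

import os

print(os.getcwd())                 # Current working directory
print(os.path.exists('data.csv'))  # Is data.csv visible from here?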
Garbled Chinese characters?
Try a different encoding:

pd.read_csv('data.csv', encoding='gbk')  # Commonly needed on Chinese Windows systems
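If the encoding is unknown, one simple approach is to try a few common encodings in turn; a minimal sketch, assuming a wrong encoding raises UnicodeDecodeError (the typical behaviour):

import pandas as pd

# Try common encodings until one succeeds
for enc in ('utf-8', 'gbk', 'latin-1'):
    try:
        df = pd.read_csv('data.csv', encoding=enc)
        print(f'Read succeeded with encoding={enc}')
        break
    except UnicodeDecodeError:
        continue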
Skip certain lines?
pd.read_csv('data.csv', skiprows=1) # Skip the first line
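skiprows also accepts a list of 0-based row indices, which is useful when specific lines (rather than just the first few) need to be dropped; for example:

import pandas as pd

# Skip specific rows by 0-based index (here: the 1st and 3rd lines of the file)
df = pd.read_csv('data.csv', skiprows=[0, 2])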
Those are the most common approaches. pandas is recommended for everyday analysis because it is simple and efficient; the csv module is a good fit for lightweight scripts with no third-party dependencies.
