search
HomeBackend DevelopmentC#.Net TutorialThree ways to read Excel in C# and comparative analysis

(1) OleDB method

Advantages: Treat Excel directly as a data source, read the content directly through SQL, and the reading speed is faster.

Disadvantages: The method of reading data is not flexible enough and cannot directly read a certain cell. Only after reading the entire Sheet page (the result is Datatable) can the specified value be obtained in the Datatable based on the number of rows and columns. .

When the amount of Excel data is large. It will occupy a lot of memory, and when there is not enough memory, a memory overflow exception will be thrown.

The reading code is as follows:

 
public DataTable GetExcelTableByOleDB(string strExcelPath, string tableName)
{
  try
  {
    DataTable dtExcel = new DataTable();
    //数据表
    DataSet ds = new DataSet();
    //获取文件扩展名
    string strExtension = System.IO.Path.GetExtension(strExcelPath);
    string strFileName = System.IO.Path.GetFileName(strExcelPath);
    //Excel的连接
    OleDbConnection objConn = null;
    switch (strExtension)
    {
      case ".xls":
        objConn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 8.0;HDR=NO;IMEX=1;\"");
        break;
      case ".xlsx":
        objConn = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strExcelPath + ";" + "Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1;\"");
        break;
      default:
        objConn = null;
        break;
    }
    if (objConn == null)
    {
      return null;
    }
    objConn.Open();
    //获取Excel中所有Sheet表的信息
    //System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);
    //获取Excel的第一个Sheet表名
    //string tableName = schemaTable.Rows[0][2].ToString().Trim();
    string strSql = "select * from [" + tableName + "]";
    //获取Excel指定Sheet表中的信息
    OleDbCommand objCmd = new OleDbCommand(strSql, objConn);
    OleDbDataAdapter myData = new OleDbDataAdapter(strSql, objConn);
    myData.Fill(ds, tableName);//填充数据
    objConn.Close();
    //dtExcel即为excel文件中指定表中存储的信息
    dtExcel = ds.Tables[tableName];
    return dtExcel;
  }
  catch
  {
    return null;
  }
}

The following explains the connection string

HDR=Yes, which represents the first Rows are titles and are not used as data (but in my actual use, if there are complex values ​​in the first row, the read Datatable column titles will automatically be set to F1, F2, etc., which is inconsistent with actual applications, so At that time, all the content was read into the Datatable through HDR=No, and then the first row was manually set as the title); IMEX (IMport EXport mode) setting
IMEX has three modes:
0 is Export mode
1 is Import mode
2 is Linked mode (full update capabilities)
What I particularly want to explain here is the IMEX parameter, because different modes represent different reading and writing behaviors:
When IMEX =0 is the "export mode". The Excel file opened in this mode can only be used for "writing" purposes.
When IMEX=1, it is "Import mode". The Excel file opened in this mode can only be used for "reading" purposes.
When IMEX=2, it is "link mode". The Excel file opened in this mode can support both "reading" and "writing" purposes.

--------------------------------

In addition, read Excel2007 version of the file, the version should be changed from 8.0 to 12.0. At the same time, the driver can no longer use Jet, but should use ACE. Responsible for causing "No installable ISAM found" error.

--------------------------------

Also found on the Internet using In this method, the number of Sheet tables retrieved may be greater than the number of Sheet tables in the actual Excel table. There are two reasons for this:

1. The extracted names include those in the XL Naming Manager name (see the formula of XL2007 - Naming Manager, shortcut key Crtl+F3);

2. The extracted name includes the FilterDatabase suffix, which is used by XL to record the Filter range.

The first point is relatively simple, just delete the content in the existing naming manager; the second point is more troublesome to deal with. These names will still be retained after the Filter is deleted. The simple way is to add a new Sheet and then Copy the original Sheet into it. However, the actual situation cannot do the above checks for every Excel. The filtering scheme is given below. (We have verified this problem, please verify it yourself)

//objConn为读取Excel的链接,下面通过过滤来获取有效的Sheet页名称集合
 System.Data.DataTable schemaTable = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);
 List<string> lstSheetNames = new List<string>();
 for (int i = 0; i < schemaTable.Rows.Count; i++)
 {
   string strSheetName = (string)dtSheetName.Rows[i]["TABLE_NAME"];
   if (strSheetName.Contains("$") && !strSheetName.Replace("&#39;", "").EndsWith("$"))
   {
     //过滤无效SheetName完毕....
     continue;
   }
   if (lstSheetNames != null && !lstSheetNames.Contains(strSheetName))
     lstSheetNames.Add(strSheetName);
 }

Because the read out SheetName is invalid and the last character is generally not $. If the SheetName has some special symbols, the read SheetName will be automatically added with single quotes. For example, if you edit SheetName to MySheet(1) in Excel, the read SheetName will be: 'MySheet(1)$', so it is best to filter the single quotes before judging whether the last character is $.

--------------------------------

(2) Com component method (implemented by adding Microsoft.Office.Interop.Excel reference)

Advantages: It can read data in Excel very flexibly, and users can flexibly call various functions for processing.

Disadvantages: Cell-based processing has a slow reading speed. It is best not to use this method to read files with large amounts of data.

You need to add the corresponding DLL reference. This reference must exist before it can be used. If the Web site is deployed on IIS, Excel needs to be installed on the server machine, and sometimes IIS permissions need to be configured.

The reading code is as follows:

private Stopwatch wath = new Stopwatch();
/// <summary>
/// 使用COM读取Excel
/// </summary>
/// <param name="excelFilePath">路径</param>
/// <returns>DataTabel</returns>
public System.Data.DataTable GetExcelData(string excelFilePath)
{
  Excel.Application app = new Excel.Application();
  Excel.Sheets sheets;
  Excel.Workbook workbook = null;
  object oMissiong = System.Reflection.Missing.Value;
  System.Data.DataTable dt = new System.Data.DataTable();
  wath.Start();
  try
  {
    if (app == null)
    {
      return null;
    }
    workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong,
      oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong);
    //将数据读入到DataTable中——Start 
    sheets = workbook.Worksheets;
    Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);//读取第一张表
    if (worksheet == null)
      return null;
    string cellContent;
    int iRowCount = worksheet.UsedRange.Rows.Count;
    int iColCount = worksheet.UsedRange.Columns.Count;
    Excel.Range range;
    //负责列头Start
    DataColumn dc;
    int ColumnID = 1;
    range = (Excel.Range)worksheet.Cells[1, 1];
    while (range.Text.ToString().Trim() != "")
    {
      dc = new DataColumn();
      dc.DataType = System.Type.GetType("System.String");
      dc.ColumnName = range.Text.ToString().Trim();
      dt.Columns.Add(dc);
 
      range = (Excel.Range)worksheet.Cells[1, ++ColumnID];
    }
    //End
    for (int iRow = 2; iRow <= iRowCount; iRow++)
    {
      DataRow dr = dt.NewRow();
      for (int iCol = 1; iCol <= iColCount; iCol++)
      {
        range = (Excel.Range)worksheet.Cells[iRow, iCol];
        cellContent = (range.Value2 == null) ? "" : range.Text.ToString();
          dr[iCol - 1] = cellContent;
      }
      dt.Rows.Add(dr);
    }
    wath.Stop();
    TimeSpan ts = wath.Elapsed;
    //将数据读入到DataTable中——End
    return dt;
  }
  catch
  {
    return null;
  }
  finally
  {
    workbook.Close(false, oMissiong, oMissiong);
    System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook);
    workbook = null;
    app.Workbooks.Close();
    app.Quit();
    System.Runtime.InteropServices.Marshal.ReleaseComObject(app);
    app = null;
    GC.Collect();
    GC.WaitForPendingFinalizers();
  }
}
/// <summary>
/// 使用COM,多线程读取Excel(1 主线程、4 副线程)
/// </summary>
/// <param name="excelFilePath">路径</param>
/// <returns>DataTabel</returns>
public System.Data.DataTable ThreadReadExcel(string excelFilePath)
{
  Excel.Application app = new Excel.Application();
  Excel.Sheets sheets = null;
  Excel.Workbook workbook = null;
  object oMissiong = System.Reflection.Missing.Value;
  System.Data.DataTable dt = new System.Data.DataTable();
  wath.Start();
  try
  {
    if (app == null)
    {
      return null;
    }
    workbook = app.Workbooks.Open(excelFilePath, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong,
      oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong, oMissiong);
    //将数据读入到DataTable中——Start 
    sheets = workbook.Worksheets;
    Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);//读取第一张表
    if (worksheet == null)
      return null;
    string cellContent;
    int iRowCount = worksheet.UsedRange.Rows.Count;
    int iColCount = worksheet.UsedRange.Columns.Count;
    Excel.Range range;
    //负责列头Start
    DataColumn dc;
    int ColumnID = 1;
    range = (Excel.Range)worksheet.Cells[1, 1];
    while (iColCount >= ColumnID)
    {
      dc = new DataColumn();
      dc.DataType = System.Type.GetType("System.String");
      string strNewColumnName = range.Text.ToString().Trim();
      if (strNewColumnName.Length == 0) strNewColumnName = "_1";
      //判断列名是否重复
      for (int i = 1; i < ColumnID; i++)
      {
        if (dt.Columns[i - 1].ColumnName == strNewColumnName)
          strNewColumnName = strNewColumnName + "_1";
      }
      dc.ColumnName = strNewColumnName;
      dt.Columns.Add(dc);
      range = (Excel.Range)worksheet.Cells[1, ++ColumnID];
    }
    //End
    //数据大于500条,使用多进程进行读取数据
    if (iRowCount - 1 > 500)
    {
      //开始多线程读取数据
      //新建线程
      int b2 = (iRowCount - 1) / 10;
      DataTable dt1 = new DataTable("dt1");
      dt1 = dt.Clone();
      SheetOptions sheet1thread = new SheetOptions(worksheet, iColCount, 2, b2 + 1, dt1);
      Thread othread1 = new Thread(new ThreadStart(sheet1thread.SheetToDataTable));
      othread1.Start();
      //阻塞 1 毫秒,保证第一个读取 dt1
      Thread.Sleep(1);
      DataTable dt2 = new DataTable("dt2");
      dt2 = dt.Clone();
      SheetOptions sheet2thread = new SheetOptions(worksheet, iColCount, b2 + 2, b2 * 2 + 1, dt2);
      Thread othread2 = new Thread(new ThreadStart(sheet2thread.SheetToDataTable));
      othread2.Start();
      DataTable dt3 = new DataTable("dt3");
      dt3 = dt.Clone();
      SheetOptions sheet3thread = new SheetOptions(worksheet, iColCount, b2 * 2 + 2, b2 * 3 + 1, dt3);
      Thread othread3 = new Thread(new ThreadStart(sheet3thread.SheetToDataTable));
      othread3.Start();
      DataTable dt4 = new DataTable("dt4");
      dt4 = dt.Clone();
      SheetOptions sheet4thread = new SheetOptions(worksheet, iColCount, b2 * 3 + 2, b2 * 4 + 1, dt4);
      Thread othread4 = new Thread(new ThreadStart(sheet4thread.SheetToDataTable));
      othread4.Start();
      //主线程读取剩余数据
      for (int iRow = b2 * 4 + 2; iRow <= iRowCount; iRow++)
      {
        DataRow dr = dt.NewRow();
        for (int iCol = 1; iCol <= iColCount; iCol++)
        {
          range = (Excel.Range)worksheet.Cells[iRow, iCol];
          cellContent = (range.Value2 == null) ? "" : range.Text.ToString();
          dr[iCol - 1] = cellContent;
        }
        dt.Rows.Add(dr);
      }
      othread1.Join();
      othread2.Join();
      othread3.Join();
      othread4.Join();
      //将多个线程读取出来的数据追加至 dt1 后面
      foreach (DataRow dr in dt.Rows)
        dt1.Rows.Add(dr.ItemArray);
      dt.Clear();
      dt.Dispose();
      foreach (DataRow dr in dt2.Rows)
        dt1.Rows.Add(dr.ItemArray);
      dt2.Clear();
      dt2.Dispose();
      foreach (DataRow dr in dt3.Rows)
        dt1.Rows.Add(dr.ItemArray);
      dt3.Clear();
      dt3.Dispose();
      foreach (DataRow dr in dt4.Rows)
        dt1.Rows.Add(dr.ItemArray);
      dt4.Clear();
      dt4.Dispose();
      return dt1;
    }
    else
    {
      for (int iRow = 2; iRow <= iRowCount; iRow++)
      {
        DataRow dr = dt.NewRow();
        for (int iCol = 1; iCol <= iColCount; iCol++)
        {
          range = (Excel.Range)worksheet.Cells[iRow, iCol];
          cellContent = (range.Value2 == null) ? "" : range.Text.ToString();
          dr[iCol - 1] = cellContent;
        }
        dt.Rows.Add(dr);
      }
    }
    wath.Stop();
    TimeSpan ts = wath.Elapsed;
    //将数据读入到DataTable中——End
    return dt;
  }
  catch
  {
    return null;
  }
  finally
  {
    workbook.Close(false, oMissiong, oMissiong);
    System.Runtime.InteropServices.Marshal.ReleaseComObject(workbook);
    System.Runtime.InteropServices.Marshal.ReleaseComObject(sheets);
    workbook = null;
    app.Workbooks.Close();
    app.Quit();
    System.Runtime.InteropServices.Marshal.ReleaseComObject(app);
    app = null;
    GC.Collect();
    GC.WaitForPendingFinalizers();
  }
}

(3) NPOI method to read Excel (this method has not been tested)

NPOI is the .NET version of the POI project. POI is an open source Java project for reading and writing Excel, WORD and other Microsoft OLE2 component documents. Using NPOI, you can read and write WORD/EXCEL documents on a machine that does not have Office installed or a corresponding environment.

Advantages: Reading Excel is faster, and the reading method is flexible.

Disadvantages: The corresponding plug-in needs to be downloaded and added to the system reference.

/// <summary>
/// 将excel中的数据导入到DataTable中
/// </summary>
/// <param name="sheetName">excel工作薄sheet的名称</param>
/// <param name="isFirstRowColumn">第一行是否是DataTable的列名</param>
/// <returns>返回的DataTable</returns>
public DataTable ExcelToDataTable(string sheetName, bool isFirstRowColumn)
{
  ISheet sheet = null;
  DataTable data = new DataTable();
  int startRow = 0;
  try
  {
    fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);
    if (fileName.IndexOf(".xlsx") > 0) // 2007版本
      workbook = new XSSFWorkbook(fs);
    else if (fileName.IndexOf(".xls") > 0) // 2003版本
      workbook = new HSSFWorkbook(fs);
    if (sheetName != null)
    {
      sheet = workbook.GetSheet(sheetName);
    }
    else
    {
      sheet = workbook.GetSheetAt(0);
    }
    if (sheet != null)
    {
      IRow firstRow = sheet.GetRow(0);
      int cellCount = firstRow.LastCellNum; //一行最后一个cell的编号 即总的列数
      if (isFirstRowColumn)
      {
        for (int i = firstRow.FirstCellNum; i < cellCount; ++i)
        {
          DataColumn column = new DataColumn(firstRow.GetCell(i).StringCellValue);
          data.Columns.Add(column);
        }
        startRow = sheet.FirstRowNum + 1;
      }
      else
      {
        startRow = sheet.FirstRowNum;
      }
      //最后一列的标号
      int rowCount = sheet.LastRowNum;
      for (int i = startRow; i <= rowCount; ++i)
      {
        IRow row = sheet.GetRow(i);
        if (row == null) continue; //没有数据的行默认是null       
         
        DataRow dataRow = data.NewRow();
        for (int j = row.FirstCellNum; j < cellCount; ++j)
        {
          if (row.GetCell(j) != null) //同理,没有数据的单元格都默认是null
            dataRow[j] = row.GetCell(j).ToString();
        }
        data.Rows.Add(dataRow);
      }
    }
    return data;
  }
  catch (Exception ex)
  {
    Console.WriteLine("Exception: " + ex.Message);
    return null;
  }
}

For more C# three ways to read Excel and comparative analysis related articles, please pay attention to the PHP Chinese website!


Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Mastering C# .NET Design Patterns: From Singleton to Dependency InjectionMastering C# .NET Design Patterns: From Singleton to Dependency InjectionMay 09, 2025 am 12:15 AM

Design patterns in C#.NET include Singleton patterns and dependency injection. 1.Singleton mode ensures that there is only one instance of the class, which is suitable for scenarios where global access points are required, but attention should be paid to thread safety and abuse issues. 2. Dependency injection improves code flexibility and testability by injecting dependencies. It is often used for constructor injection, but it is necessary to avoid excessive use to increase complexity.

C# .NET in the Modern World: Applications and IndustriesC# .NET in the Modern World: Applications and IndustriesMay 08, 2025 am 12:08 AM

C#.NET is widely used in the modern world in the fields of game development, financial services, the Internet of Things and cloud computing. 1) In game development, use C# to program through the Unity engine. 2) In the field of financial services, C#.NET is used to develop high-performance trading systems and data analysis tools. 3) In terms of IoT and cloud computing, C#.NET provides support through Azure services to develop device control logic and data processing.

C# .NET Framework vs. .NET Core/5/6: What's the Difference?C# .NET Framework vs. .NET Core/5/6: What's the Difference?May 07, 2025 am 12:06 AM

.NETFrameworkisWindows-centric,while.NETCore/5/6supportscross-platformdevelopment.1).NETFramework,since2002,isidealforWindowsapplicationsbutlimitedincross-platformcapabilities.2).NETCore,from2016,anditsevolutions(.NET5/6)offerbetterperformance,cross-

The Community of C# .NET Developers: Resources and SupportThe Community of C# .NET Developers: Resources and SupportMay 06, 2025 am 12:11 AM

The C#.NET developer community provides rich resources and support, including: 1. Microsoft's official documents, 2. Community forums such as StackOverflow and Reddit, and 3. Open source projects on GitHub. These resources help developers improve their programming skills from basic learning to advanced applications.

The C# .NET Advantage: Features, Benefits, and Use CasesThe C# .NET Advantage: Features, Benefits, and Use CasesMay 05, 2025 am 12:01 AM

The advantages of C#.NET include: 1) Language features, such as asynchronous programming simplifies development; 2) Performance and reliability, improving efficiency through JIT compilation and garbage collection mechanisms; 3) Cross-platform support, .NETCore expands application scenarios; 4) A wide range of practical applications, with outstanding performance from the Web to desktop and game development.

Is C# Always Associated with .NET? Exploring AlternativesIs C# Always Associated with .NET? Exploring AlternativesMay 04, 2025 am 12:06 AM

C# is not always tied to .NET. 1) C# can run in the Mono runtime environment and is suitable for Linux and macOS. 2) In the Unity game engine, C# is used for scripting and does not rely on the .NET framework. 3) C# can also be used for embedded system development, such as .NETMicroFramework.

The .NET Ecosystem: C#'s Role and BeyondThe .NET Ecosystem: C#'s Role and BeyondMay 03, 2025 am 12:04 AM

C# plays a core role in the .NET ecosystem and is the preferred language for developers. 1) C# provides efficient and easy-to-use programming methods, combining the advantages of C, C and Java. 2) Execute through .NET runtime (CLR) to ensure efficient cross-platform operation. 3) C# supports basic to advanced usage, such as LINQ and asynchronous programming. 4) Optimization and best practices include using StringBuilder and asynchronous programming to improve performance and maintainability.

C# as a .NET Language: The Foundation of the EcosystemC# as a .NET Language: The Foundation of the EcosystemMay 02, 2025 am 12:01 AM

C# is a programming language released by Microsoft in 2000, aiming to combine the power of C and the simplicity of Java. 1.C# is a type-safe, object-oriented programming language that supports encapsulation, inheritance and polymorphism. 2. The compilation process of C# converts the code into an intermediate language (IL), and then compiles it into machine code execution in the .NET runtime environment (CLR). 3. The basic usage of C# includes variable declarations, control flows and function definitions, while advanced usages cover asynchronous programming, LINQ and delegates, etc. 4. Common errors include type mismatch and null reference exceptions, which can be debugged through debugger, exception handling and logging. 5. Performance optimization suggestions include the use of LINQ, asynchronous programming, and improving code readability.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

MinGW - Minimalist GNU for Windows

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.