Home  >  Article  >  Backend Development  >  ADO.NET reads EXCEL implementation code ((c#))

ADO.NET reads EXCEL implementation code ((c#))

高洛峰
高洛峰Original
2017-01-13 16:57:161443browse

// Connection string

// 连接字符串            
         string xlsPath = Server.MapPath("~/app_data/somefile.xls"); // 绝对物理路径
         string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;" +
                         "Extended Properties=Excel 8.0;" + 
                         "data source=" + xlsPath;
         // 查询语句
         string sql = "SELECT * FROM [Sheet1$]";
         DataSet ds = new DataSet();
         OleDbDataAdapter da = new OleDbDataAdapter(sql, connStr);
         da.Fill(ds);    // 填充DataSet        

         // 在这里对DataSet中的数据进行操作       

         // 输出,绑定数据
         GridView1.DataSource = ds.Tables[0]; 
         GridView1.DataBind();

is very simple, right? ! Everything is just like operating a database, except that you need to pay attention to:
1. The data provider uses Jet, and you need to specify the Extended Properties keyword to set Excel-specific properties. Different versions of Excel correspond to different property values: the valid Excel version for the Extended Properties value.
For Microsoft Excel 8.0 (97), 9.0 (2000), and 10.0 (2002) workbooks, use Excel 8.0.


For Microsoft Excel 5.0 and 7.0 (95) workbooks, use Excel 5.0.

For Microsoft Excel 4.0 workbooks, please use Excel 4.0.

For Microsoft Excel 3.0 workbooks, please use Excel 3.0.

ref:
http://msdn.microsoft.com/library/chs/default.asp?url=/library/CHS/dv_vbcode/html/vbtskcodeexamplereadingexceldataintodataset.asp

2 . The data source path uses the physical absolute path (same as Access)

3. How to reference table name?
A valid reference to a table (or range) in an Excel workbook.
To refer to a range of fully used worksheets, specify the worksheet name followed by a dollar sign. For example:

select * from [Sheet1$]
To reference a specific address range on a worksheet, specify the worksheet name followed by a dollar sign and the range. For example:

select * from [Sheet1$A1:B10]
To reference a specified range, use the name of the range. For example:

select * from [MyNamedRange]
ref:
http://msdn.microsoft.com/library/chs/default.asp?url=/library/CHS/dv_vbcode/html /vbtskcodeexamplereadingexceldataintodataset.asp
Note:
You can reference three objects in the Excel workbook:
• The entire worksheet: [Sheet1$], Sheet1 is the name of the worksheet
• On the worksheet Named cell range: [MyNamedRange] (No need to specify the worksheet, because the named range can only be unique in the entire xls)
XLS naming method: Select the cell range》Insert》Name》Definition
• On the worksheet The unnamed cell range of: [Sheet1$A1:B10]
(Among the various objects provided by relational databases (tables, views, stored procedures, etc.), the Excel data source only provides objects equivalent to tables, which are The specified workbook consists of worksheets and defined named ranges. Named ranges are considered "tables" and worksheets are considered "system tables")

Note:
•Must use [] (square brackets), otherwise it will report:
FROM clause syntax error
•Must be followed by $ (dollar sign), otherwise it will report:
Microsoft Jet database engine cannot find the object 'Sheet2'. Please make sure the object exists and write its name and path correctly.
•If the worksheet name is incorrect or does not exist, it will report:
'Sheet2$' is not a valid name. Make sure it doesn't contain invalid characters or punctuation, and that the name isn't too long.
•In How to use ADO to process Excel data in Visual Basic or VBA, it is mentioned that you can use
~ and ‘(tilde and single quote) instead of [] to use ADO. NET test did not succeed, reporting:
FROM clause syntax error
•When referencing the worksheet name ([Sheet1$]), the data provider believes that the data table starts from the upper-left non-empty unit on the specified worksheet grid starts. For example, if the worksheet starts from row 3, column C, before row 3, column C, and rows 1 and 2 are all empty, only the data starting from row 3, column C will be displayed; the maximum range of the final table will be used. Ends with a non-empty unit within;
•Therefore, if you need to read the range accurately, you should use the named range [NamedRange], or specify the address: [Sheet1$A1:C10]

4. How to quote column names?
•According to the default connection string, the data provider will use the first row in the valid area as the column name. If a cell in this row is empty, it will be represented by F1 and F2, where the ordinal number is followed by the cell name. The positions are consistent, starting from 1;
•If you want the first row to be displayed as data instead of column names, you can specify: HDR=NO in the Extended Properties property of the connection string.
The default value is: HDR=NO. The format is as follows :

string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Extended Properties=\"Excel 8.0;HDR=NO\";" +
"data source= " + xlsPath;
Note: Excel 8.0;HDR=NO requires the use of double quotes (the backslash here is the escape in C#)

ref:
ms-help:// "Connecting to Excel" section in MS.VSCC.v80/MS.MSDN.v80/MS.VisualStudio.v80.chs/WD_ADONET/html/745c5f95-2f02-4674-b378-6d51a7ec2490.htm
(Note: In my own In MSDN, the example uses two double quotes, which is wrong, and the test fails. The original text says:

Note that the double quotes required by Extended Properties must also be added with double quotes ##. #)

In this case, all column names start with F, followed by indexes, starting from F1, F2, F3. . . . . . .

5. Why is the valid cell data not displayed?

The possible reason for this situation is that in the default connection, the data provider infers the data type of subsequent cells based on the previous cells.
You can specify IMEX=1

"IMEX=1;" in Extended Properties to notify the driver to always read the "intermixed" data column as text

ref: Same as 4

PS:在baidu这个问题的时候,有网友说,将每个单元都加上引号,这固然是格方案,但是工作量何其大啊,又不零活,庆幸自己找到”治本药方“

more ref:
如何在 Visual Basic 或 VBA 中使用 ADO 来处理 Excel 数据
http://dotnet.aspx.cc/ShowDetail.aspx?id=C673E2CD-3F1E-4919-8CE0-D69B894A0599


用程序经常需要与Excel进行数据交互,以上阐述了基于ADO.NET
读取Excel的基本方法与技巧。现在要介绍是如何动态的读取Excel数据,这里的动态指的是事先不知道Excel文件的是什么样的结构,或者无法预
测,比如一张.xls文件有多少张sheet,而且每张sheet的结构可能都不一样等等。
其实我们可以通过获取Excel的“架构信息”来动态
的构造查询语句。这里的“架构信息”与数据库领域的“数据库架构信息”意义相同(也称“元数据”),对于整个数据库,这些“元数据”通常包括数据库或可通
过数据库中的数据源、表和视图得到的目录以及所存在的约束等;而对于数据库中的表,架构信息包括主键、列和自动编号字段等。
在上文中提到

在关系数据库提供的各种对象中(表、视图、存储过程等),Excel 数据源仅提供相当于表的对象,它由指定工作簿中的工作表和定义的命名区域组成。命名区域被视为“表”,而工作表被视为“系统表”)

 

这里我们将Excel也当作一个“数据库”来对待,然后利用OleDbConnection.GetOleDbSchemaTable 方法
要获取所需的架构信息,该方法获取的架构信息与ANSI SQl-92是兼容的:


意:对于那些不熟悉 OLE DB 架构行集的人而言,它们基本上是由 ANSI SQL-92
定义的数据库构造的标准化架构。每个架构行集具有为指定构造提供定义元数据的一组列(称作 .NET
文档中的“限制列”)。这样,如果请求架构信息(例如,列的架构信息或排序规则的架构信息),则您会明确知道可以得到哪种类型的数据。如果希望了解更多信
息,请访问 Appendix B:Schema Rowsets。
ref:http://www.microsoft.com/china/msdn/library/office/office/odatanet2.mspx?mfr=true

以下是读取Excel文件内“表”定义元数据,并显示出来的的程序片断:

// 读取Excel数据,填充DataSet
         // 连接字符串            
         string xlsPath = Server.MapPath("~/app_data/somefile.xls");
         string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;" +

 "Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\";" + // 指定扩展属性为
 Microsoft Excel 8.0 (97) 9.0 (2000) 10.0 (2002),并且第一行作为数据返回,且以文本方式读取
                         "data source=" + xlsPath;
         string sql_F = "SELECT * FROM [{0}]";

         OleDbConnection conn = null;
         OleDbDataAdapter da = null;
         DataTable tblSchema = null;
         IList<string> tblNames = null;

         // 初始化连接,并打开
         conn = new OleDbConnection(connStr);
         conn.Open();

         // 获取数据源的表定义元数据                        
         //tblSchema = conn.GetSchema("Tables");
         tblSchema = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });

         GridView1.DataSource = tblSchema;
         GridView1.DataBind();

         // 关闭连接
         conn.Close();
 GetOleDbSchemaTable 方法的详细说明可以参考:
 http://msdn2.microsoft.com/zh-CN/library/system.data.oledb.oledbconnection.getoledbschematable.aspx

 接着是一段利用“架构信息”动态读取Excel内部定义的表单或者命名区域的程序片断:

         // 读取Excel数据,填充DataSet
         // 连接字符串            
         string xlsPath = Server.MapPath("~/app_data/somefile.xls");
         string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;" +

 "Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\";" + // 指定扩展属性为
 Microsoft Excel 8.0 (97) 9.0 (2000) 10.0 (2002),并且第一行作为数据返回,且以文本方式读取
                         "data source=" + xlsPath;
         string sql_F = "SELECT * FROM [{0}]";

         OleDbConnection conn = null;
         OleDbDataAdapter da = null;
         DataTable tblSchema = null;
         IList<string> tblNames = null;

         // 初始化连接,并打开
         conn = new OleDbConnection(connStr);
         conn.Open();

         // 获取数据源的表定义元数据                        
         //tblSchema = conn.GetSchema("Tables");
         tblSchema = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });

         //GridView1.DataSource = tblSchema;
         //GridView1.DataBind();

         // 关闭连接
         //conn.Close();

         tblNames = new List<string>();
         foreach (DataRow row in tblSchema.Rows) {
             tblNames.Add((string)row["TABLE_NAME"]); // 读取表名
         }

         // 初始化适配器
         da = new OleDbDataAdapter();
         // 准备数据,导入DataSet
         DataSet ds = new DataSet();

         foreach (string tblName in tblNames) {
             da.SelectCommand = new OleDbCommand(String.Format(sql_F, tblName), conn);
             try {
                 da.Fill(ds, tblName);
             }
             catch {
                 // 关闭连接
                 if (conn.State == ConnectionState.Open) {
                     conn.Close();
                 }
                 throw;
             }
         }

         // 关闭连接
         if (conn.State == ConnectionState.Open) {
             conn.Close();
         }

         // 对导入DataSet的每张sheet进行处理        
         // 这里仅做显示
         GridView1.DataSource = ds.Tables[0];
         GridView1.DataBind();

         GridView2.DataSource = ds.Tables[1];
         GridView2.DataBind();

这里我们就不需要对SELEC 语句进行“硬编码”,可以根据需要动态的构造FROM 字句的“表名”。


不仅可以,获取表明,还可以获取每张表内的字段名、字段类型等信息:

tblSchema = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, new object[] { null, null, null, null });

在ADO.nET
1.x 时候只有OleDb提供了GetOleDbSchemaTable
方法,而SqlClient或者OrcaleClient没有对应的方法,因为对应数据库已经提供了类似功能的存储过程或者系统表供应用程序访问,比如对
于Sql Server:

SELECT *
FROM Northwind.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'Customers'

而在ADO.NET 2.0中每个xxxConnenction都实现了基类System.Data.Common.DbConnection的 GetSchemal 方法
来获取数据源的架构信息。
http://msdn2.microsoft.com/zh-cn/library/system.data.common.dbconnection.getschema.aspx

//---------------------------------------------------------------------------
//IMEX 正確寫法
string connExcel = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath +";Extended Properties='Excel 8.0;IMEX=1;'"; //HDR=NO;IMEX=1

using System;
 using System.Data;
 using System.Configuration;
 using System.Web;
 using System.Web.Security;
 using System.Web.UI;
 using System.Web.UI.WebControls;
 using System.Web.UI.WebControls.WebParts;
 using System.Web.UI.HtmlControls;

 using System.Data.SqlClient;
 using System.Data.OleDb;

 public partial class _Default : System.Web.UI.Page 
 {
     protected void Page_Load(object sender, EventArgs e)
     {

     }
     protected void Button1_Click(object sender, EventArgs e)
     {

         string strconn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @"D:/last.xls" + ";Extended Properties=Excel 8.0;"; //HDR=no";//IMEX=1
         OleDbConnection conn = new OleDbConnection(strconn);
         DataSet myset = new DataSet();
         try
         {
             conn.Open();

 
             string mysql = "select * from [Sheet1$] ";//where chs <> &#39;&#39;

             OleDbDataAdapter aper = new OleDbDataAdapter(mysql, conn);

             myset.Tables.Clear();
             aper.Fill(myset, "book");

             conn.Close();

             GridView1.DataSource = myset.Tables["book"];
             GridView1.DataBind();
         }
         catch (Exception ex)
         {
             conn.Close();
             this.lb_msg.Text = ex.Message;
             return;
             // return ex.Message;
         }
     }
 }


更多ADO.NET 读取EXCEL的实现代码((c#))相关文章请关注PHP中文网!


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn