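OpenXML in practice: find specific tables and paragraphs in a Word document and convert them to a String
Reference: https://stackoverflow.com/questions/45466325/how-to-read-values-from-a-table-in-a-word-doc-using-c-sharp
You may remember my earlier post, "Introduction to and use of OpenXML: keeping specific tables and sentences from Word for display on a web page (OpenXML keep special(keyword) table and paragraph)". In my experience Microsoft's OpenXmlPowerTools was unreliable: some Word files simply failed to convert. The feature was still a hard requirement, so I had to hand-roll it myself.
Requirement: find the tables and sentences in a Word document that contain a specific keyword (sentences must also contain a digit), and keep them for display on a web page.
Step 1: Load your Word file into a MemoryStream, then open that stream with WordprocessingDocument.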
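    // Requires: using System.IO; using DocumentFormat.OpenXml.Packaging;
    // The uploaded file bytes were stored in Session earlier.
    byte[] fileBytes = (byte[])Session["FileUpload1.FileBytes"];
    using (MemoryStream memoryStream = new MemoryStream())
    {
        memoryStream.Write(fileBytes, 0, fileBytes.Length);
        // true opens the document as editable; false would be enough for read-only extraction.
        using (WordprocessingDocument wDoc = WordprocessingDocument.Open(memoryStream, true))
        {
            // Step 2 code goes here
        }
    }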
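Step 1 writes the bytes into an empty MemoryStream instead of wrapping the array directly. The following is a minimal sketch of the difference, using only System.IO; the byte values and the class name StreamDemo are made up for illustration. A stream created with `new MemoryStream()` can grow, while `new MemoryStream(bytes)` is fixed at the array's size, which matters once a document is opened as editable.

```csharp
using System;
using System.IO;

public static class StreamDemo
{
    public static void Main()
    {
        // Placeholder bytes standing in for an uploaded .docx (hypothetical data).
        byte[] uploaded = { 1, 2, 3, 4 };

        // Expandable stream: created empty, then filled, as in Step 1.
        using (var expandable = new MemoryStream())
        {
            expandable.Write(uploaded, 0, uploaded.Length);
            Console.WriteLine(expandable.Length);   // 4
            Console.WriteLine(expandable.Position); // 4 (at the end after writing)
            expandable.WriteByte(5);                // growing is allowed
            Console.WriteLine(expandable.Length);   // 5
        }

        // Fixed-size stream: wraps the array and cannot grow past its length.
        using (var fixedSize = new MemoryStream(uploaded))
        {
            fixedSize.Position = fixedSize.Length;
            try
            {
                fixedSize.WriteByte(5);
            }
            catch (NotSupportedException)
            {
                Console.WriteLine("fixed-size stream cannot expand");
            }
        }
    }
}
```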
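Step 2: Declare a StringBuilder and walk the document with OpenXML, telling paragraphs from tables and handling each separately. A condition keeps only the elements that contain the keyword, and a few <br> tags are added so the output renders more nicely.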
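    // Requires: using System.Linq; using System.Text; using DocumentFormat.OpenXml.Wordprocessing;
    // keyword is the search term (a string defined elsewhere).
    StringBuilder textBuilder = new StringBuilder();
    // Use Body directly; the first descendant of Document is not guaranteed to be the body.
    var body = wDoc.MainDocumentPart.Document.Body;
    if (body != null)
    {
        foreach (var node in body.ChildElements)
        {
            if (node is Paragraph)
            {
                // Keep paragraphs that contain the keyword and at least one digit.
                if (node.InnerText.Contains(keyword) && node.InnerText.Any(char.IsDigit))
                {
                    textBuilder.AppendLine("<br>");
                    ProcessParagraph((Paragraph)node, textBuilder);
                    textBuilder.AppendLine("<br>");
                }
            }
            else if (node is DocumentFormat.OpenXml.Wordprocessing.Table)
            {
                // Keep tables that contain the keyword anywhere in their text.
                if (node.InnerText.Contains(keyword))
                {
                    textBuilder.AppendLine("<br>");
                    ProcessTable((DocumentFormat.OpenXml.Wordprocessing.Table)node, textBuilder);
                    textBuilder.AppendLine("<br>");
                }
            }
        }
    }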
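The two keep/skip rules in Step 2 can be tried on plain strings without opening a Word file. This is only a sketch: the class KeywordFilter, its method names, and the sample sentences are hypothetical and not part of the original code. Note that `string.Contains` is case-sensitive, and `char.IsDigit` matches Unicode decimal digits, so Chinese numerals such as 三 do not count as digits.

```csharp
using System;
using System.Linq;

public static class KeywordFilter
{
    // Paragraph rule from Step 2: keyword present AND at least one digit.
    public static bool KeepParagraph(string text, string keyword)
    {
        return text.Contains(keyword) && text.Any(char.IsDigit);
    }

    // Table rule from Step 2: the keyword alone is enough.
    public static bool KeepTable(string text, string keyword)
    {
        return text.Contains(keyword);
    }

    public static void Main()
    {
        const string keyword = "revenue"; // hypothetical keyword

        Console.WriteLine(KeepParagraph("Total revenue was 120 units.", keyword));    // True
        Console.WriteLine(KeepParagraph("The revenue is discussed later.", keyword)); // False (no digit)
        Console.WriteLine(KeepParagraph("Costs rose by 5%.", keyword));               // False (no keyword)
        Console.WriteLine(KeepTable("revenue cost margin", keyword));                 // True
    }
}
```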
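Step 3: The detailed handling inside tables and paragraphs. Pipe characters are added so the table actually looks like a table XD. Once processing finishes, the result is in the StringBuilder; call ToString() on it to get the final String.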
    // Writes each table row as "| cell | cell | " followed by <br>.
    private static void ProcessTable(DocumentFormat.OpenXml.Wordprocessing.Table node, StringBuilder textBuilder)
    {
        foreach (var row in node.Descendants<DocumentFormat.OpenXml.Wordprocessing.TableRow>())
        {
            textBuilder.Append("| ");
            foreach (var cell in row.Descendants<DocumentFormat.OpenXml.Wordprocessing.TableCell>())
            {
                foreach (var para in cell.Descendants<Paragraph>())
                {
                    ProcessParagraph(para, textBuilder);
                }
                textBuilder.Append(" | ");
            }
            textBuilder.AppendLine("<br>");
        }
    }

    // Appends the text of one paragraph, or a placeholder when it is empty.
    private static void ProcessParagraph(Paragraph node, StringBuilder textBuilder)
    {
        if (node.InnerText == string.Empty)
        {
            // "無" means "none"; it marks an empty paragraph or cell.
            textBuilder.Append("無");
        }
        foreach (var text in node.Descendants<Text>())
        {
            textBuilder.Append(text.InnerText);
        }
    }
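Since the resulting string is shown on a web page, it may be worth HTML-encoding the cell text before appending it. The original code does not do this, so the following is a suggested variation rather than the method used above. It is a minimal sketch on plain strings: the class RowFormatter, the method FormatRow, and the sample cells are hypothetical, while `WebUtility.HtmlEncode` is the standard System.Net helper.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class RowFormatter
{
    // Mirrors the "| cell | cell | <br>" layout of ProcessTable,
    // but HTML-encodes each cell's text first.
    public static string FormatRow(IEnumerable<string> cells)
    {
        var sb = new StringBuilder("| ");
        foreach (var cell in cells)
        {
            // Empty cells get the same placeholder as ProcessParagraph.
            sb.Append(cell.Length == 0 ? "無" : WebUtility.HtmlEncode(cell));
            sb.Append(" | ");
        }
        sb.Append("<br>");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(FormatRow(new[] { "Item", "a < b" }));
        // | Item | a &lt; b | <br>
    }
}
```

Encoding only the cell text keeps the `<br>` tags and pipe separators added by the formatter intact.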
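Hand-rolled like this, the page ends up looking roughly like this XD
That's it for this walkthrough. Online document checking really is exhausting...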