How to extract paragraph text color from MS Word using Apache POI
I am using Apache POI. Is it possible to read the text background and foreground color of a paragraph in an MS Word document?
I found a solution:
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Range;

// fs is an InputStream opened on the .doc file
HWPFDocument doc = new HWPFDocument(fs);
Range range = doc.getRange();
for (int i = 0; i < range.numParagraphs(); i++) {
    org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
    System.out.println(pr.getEndOffset());
    // Formatting is stored per character run, so walk every run of the paragraph.
    // Bounding the loop by numCharacterRuns() also handles paragraphs without runs.
    for (int j = 0; j < pr.numCharacterRuns(); j++) {
        CharacterRun run = pr.getCharacterRun(j);
        System.out.println("-------------------------------");
        // getColor() returns the color index stored in the run, not an RGB value
        System.out.println("Color---" + run.getColor());
        System.out.println("getFontName---" + run.getFontName());
        System.out.println("getFontSize---" + run.getFontSize());
    }
}
I found the problem in this line: CharacterRun run = para.getCharacterRun(i). The index i must be an integer that is incremented on every iteration, so the code looks as follows:
// Bounded loop: visits each run once and cannot run past the last one
for (int c = 0; c < para.numCharacterRuns(); c++) {
    CharacterRun run = para.getCharacterRun(c);
    int x = run.getPicOffset();
    System.out.println("pic offset" + x);
}
In C# (with NPOI), the 24-bit color of each run can be read like this:
if (paragraph != null)
{
    int numberOfRuns = paragraph.NumCharacterRuns;
    for (int runIndex = 0; runIndex < numberOfRuns; runIndex++)
    {
        CharacterRun run = paragraph.GetCharacterRun(runIndex);
        string color = getColor24(run.GetIco24());
    }
}
The getColor24 function converts the color to hex format in C#:
public static String getColor24(int argbValue)
{
    // -1 means no explicit color is set
    if (argbValue == -1)
        return "";
    // The value is stored as BGR; drop the top byte, then swap red and blue
    int bgrValue = argbValue & 0x00FFFFFF;
    int rgbValue = (bgrValue & 0x0000FF) << 16 | (bgrValue & 0x00FF00)
        | (bgrValue & 0xFF0000) >> 16;
    // "X6" pads the hex string with leading zeros to six digits
    return "#" + rgbValue.ToString("X6");
}
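For the Java HWPF API used in the question, the same conversion can be sketched as below. This is a minimal port of the C# helper, not an official POI utility: the class name Ico24 and the method name toHex are made up for illustration, and it assumes, like the C# version, that the input comes from CharacterRun.getIco24() as a BGR value with -1 meaning "no color set".

```java
public class Ico24 {

    // Converts a 24-bit BGR value (assumed to come from
    // CharacterRun.getIco24()) into a "#RRGGBB" string.
    // Returns "" for -1, which the C# helper treats as "no color set".
    public static String toHex(int ico24) {
        if (ico24 == -1) {
            return "";
        }
        int bgr = ico24 & 0x00FFFFFF;             // drop the top byte
        int rgb = ((bgr & 0x0000FF) << 16)        // red: low byte -> high byte
                | (bgr & 0x00FF00)                // green stays in the middle
                | ((bgr & 0xFF0000) >> 16);       // blue: high byte -> low byte
        return String.format("#%06X", rgb);       // zero-padded, upper-case hex
    }

    public static void main(String[] args) {
        System.out.println(toHex(0x0000FF)); // red stored in the low byte
        System.out.println(toHex(0xFF0000)); // blue stored in the high byte
    }
}
```

Inside the run loop it would be called as Ico24.toHex(run.getIco24()).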
If you are working with .docx (OOXML) files, you may want to take a look at this Kotlin example:
import java.io.*
import org.apache.poi.xwpf.usermodel.XWPFDocument

fun test() {
    try {
        val file = File("file.docx")
        // use {} closes the stream even if reading fails
        FileInputStream(file.absolutePath).use { fis ->
            val document = XWPFDocument(fis)
            val paragraphs = document.paragraphs
            for (para in paragraphs) {
                println("-- (" + para.alignment + ") " + para.text)
                // Color and font are stored per run, not per paragraph
                para.runs.forEach {
                    println(
                        "text:" + it.text() + " "
                            + "(color:" + it.color
                            + ",fontFamily:" + it.fontFamily
                            + ")"
                    )
                }
            }
        }
    } catch (e: Exception) {
        e.printStackTrace()
    }
}
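The color printed above is a string. XWPFRun.getColor() is documented as returning hex text in the form RRGGBB, and in practice it can also be null (no color set) or a non-hex value such as "auto". A small Java sketch for turning that string into numeric components is shown below. The class name RunColor and the method parseRgb are hypothetical helpers, not part of POI.

```java
public class RunColor {

    // Parses an "RRGGBB" string (the form XWPFRun.getColor() is documented
    // to return) into {red, green, blue}. Returns null when the input is
    // null or not six hex digits, for example "auto".
    public static int[] parseRgb(String hex) {
        if (hex == null || !hex.matches("[0-9A-Fa-f]{6}")) {
            return null;
        }
        int rgb = Integer.parseInt(hex, 16);
        return new int[] {
            (rgb >> 16) & 0xFF, // red
            (rgb >> 8) & 0xFF,  // green
            rgb & 0xFF          // blue
        };
    }

    public static void main(String[] args) {
        int[] c = parseRgb("FF8000");
        System.out.println(c[0] + "," + c[1] + "," + c[2]);
    }
}
```

Returning null for unparseable input keeps the "no explicit color" case visible to the caller instead of silently mapping it to black.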