How do I determine that an instance of org.apache.poi.hwpf.model.ListData belongs to a numbered list or bulleted list?

Is there a way to determine whether an instance of org.apache.poi.hwpf.model.ListData belongs to a numbered list or a bulleted list?

I am using Apache POI's org.apache.poi.hwpf.HWPFDocument class to read the contents of a Word document in order to generate HTML. I can identify the list items in the document by checking whether the paragraph I am working with is an instance of org.apache.poi.hwpf.model.ListData. I cannot find a way to determine whether the ListData belongs to a bulleted list or a numbered list.


I think I have found the answer to my own question.

ListEntry aListEntry = (ListEntry) aParagraph;
ListData listData = listTables.getListData(aListEntry.getIlfo());
int numberFormat = listData.getLevel(listData.numLevels()).getNumberFormat();

getNumberFormat() returns 23 for bullet points and 0 for numbered lists. I dare say there are other format values that should also be interpreted as either bullet points or numbered lists, but at least I can now tell these two cases apart!
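The check described above can be sketched as a small helper for the HTML generation step. This is only a sketch under assumptions: the class name ListKindSketch and its methods are hypothetical and not part of Apache POI, and treating every value other than 23 as a numbered list is a simplification that may not hold for every document.

```java
// Hypothetical helper, not part of Apache POI.
class ListKindSketch {
    // Number format value reported above for bullet lists.
    static final int NFC_BULLET = 23;

    // True when the number format denotes a bullet list.
    static boolean isBullet(int numberFormat) {
        return numberFormat == NFC_BULLET;
    }

    // Assumption: anything that is not a bullet is rendered as an ordered list.
    static String htmlTag(int numberFormat) {
        return isBullet(numberFormat) ? "ul" : "ol";
    }

    public static void main(String[] args) {
        System.out.println(htmlTag(23)); // bullet list
        System.out.println(htmlTag(0));  // numbered list
    }
}
```

The int passed in would be the value obtained from getNumberFormat() in the snippet above.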


I recently posted another way to determine the list type. Unfortunately, that approach only worked in a few tests.

I can now confirm leighgory's way of determining the list type.


import java.io.FileInputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.model.ListData;
import org.apache.poi.hwpf.model.ListLevel;
import org.apache.poi.hwpf.model.ListTables;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class ListTest {

    public static void main(String[] args) {

        String filename = "/some/path/to/ListTest.doc";

        // try-with-resources closes the input stream when done
        try (FileInputStream in = new FileInputStream(filename)) {

            POIFSFileSystem fs = new POIFSFileSystem(in);
            HWPFDocument doc = new HWPFDocument(fs);
            // Get a table of all the lists in this document
            ListTables listtables = doc.getListTables();

            Range range = doc.getRange();
            for (int x = 0; x < range.numParagraphs(); x++) {
                Paragraph para = range.getParagraph(x);

                // When non-zero, (1-based) index into the pllfo
                // identifying the list to which the paragraph belongs
                if (para.getIlfo() != 0) {
                    // Get the list this paragraph belongs to
                    ListData listdata = listtables.getListData(para.getIlfo());
                    // Now get all the levels for this list
                    ListLevel[] listlevel = listdata.getLevels();
                    // Find the list level info for our paragraph
                    ListLevel level = listlevel[para.getIlvl()];
                    System.out.print("Text: \"" + para.text() + "\"");
                    // List level for this paragraph
                    System.out.print("\tListLevel: " + para.getIlvl());
                    // Additional text associated with list symbols
                    System.out.print("\tgetNumberText: \"" + level.getNumberText() + "\"");
                    // Format value for the style of list symbols
                    System.out.println("\tgetNumberFormat: " + level.getNumberFormat());
                } else {
                    // Not a list paragraph: print an empty line
                    System.out.println();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

nfc value   Numbering scheme
15          Single Byte character
16          Kanji numbering 3 (dbnum3)
17          Kanji numbering 4 (dbnum4)
18          Circle numbering (circlenum)
19          Double-byte Arabic numbering
20          46 phonetic double-byte Katakana characters (aiueodbchar)
21          46 phonetic double-byte Katakana characters (irohadbchar)
22          Arabic with leading zero (01, 02, 03, ..., 10, 11)
23          Bullet (no number at all)
24          Korean numbering 2 (ganada)
25          Korean numbering 1 (chosung)
26          Chinese numbering 1 (gb1)
27          Chinese numbering 2 (gb2)
28          Chinese numbering 3 (gb3)
29          Chinese numbering 4 (gb4)
30          Chinese Zodiac numbering 1
31          Chinese Zodiac numbering 2
32          Chinese Zodiac numbering 3
33          Taiwanese double-byte numbering 1
34          Taiwanese double-byte numbering 2
35          Taiwanese double-byte numbering 3
36          Taiwanese double-byte numbering 4
37          Chinese double-byte numbering 1
38          Chinese double-byte numbering 2
39          Chinese double-byte numbering 3
40          Chinese double-byte numbering 4
41          Korean double-byte numbering 1
42          Korean double-byte numbering 2
43          Korean double-byte numbering 3
44          Korean double-byte numbering 4
45          Hebrew non-standard decimal
46          Arabic Alif Ba Tah
47          Hebrew Biblical standard
48          Arabic Abjad style
49          Hindi vowels
50          Hindi consonants
51          Hindi numbers
52          Hindi descriptive (cardinals)
53          Thai letters
54          Thai numbers
55          Thai descriptive (cardinals)
56          Vietnamese descriptive (cardinals)
57          Page Number format - # -
58          Lower case Russian alphabet
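
The nfc values listed above could be turned into readable names when printing the result of getNumberFormat(). The following is a minimal sketch, assuming only a handful of entries copied from the list; the class NfcNames and its describe method are hypothetical names, not part of Apache POI, and the map is deliberately not exhaustive.

```java
import java.util.Map;

// Hypothetical lookup helper, not part of Apache POI.
class NfcNames {
    // A few entries copied from the list above; extend as needed.
    private static final Map<Integer, String> NAMES = Map.of(
        18, "Circle numbering (circlenum)",
        22, "Arabic with leading zero",
        23, "Bullet (no number at all)",
        58, "Lower case Russian alphabet");

    // Returns the scheme name, or a fallback text for values not in the map.
    static String describe(int nfc) {
        return NAMES.getOrDefault(nfc, "unlisted nfc value " + nfc);
    }

    public static void main(String[] args) {
        System.out.println(describe(23));
        System.out.println(describe(99));
    }
}
```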
