How to improve pyparsing start-up time?
I have written a large (to me) grammar using pyparsing to parse H248 messages, which are a text format, in about 500 lines of pyparsing constructs. The average performance is acceptable if I feed the parser many messages (about half a second per message on my PC).
But my basic use case is to parse a single message and dump out certain parts, so I wrote it as a standalone program. It now takes 12s on my x86 PC (1m12s on a shared Solaris server!) to parse the first message. Is there any technique to get around this start-up time? I tried psyco, but it doesn't help the start-up time.
What I do for now is run a long-lived server that imports the parser and accepts parse requests, plus a separate standalone program (triggered very often) that talks to this server process and so gets a quick response. This way, I lose the ability to use the parse result as a Python object. Is there a better way to achieve my purpose?
BTW, if I manually interrupt my parser before the 12s are up, the stack trace shows it was calling streamline() recursively.
Thanks! Kevin
from pyparsing import *
ALPHA = alphas
DIGIT = nums
HEXDIG = "0123456789ABCDEF"
SafeChar = alphanums + "+-&!_'?@^`~*$\\()%+."
RestChar = ";[]{}:,#<>="
COLON = ":"
DOT = "."
SLASH = "/"
DQUOTE = '"'
SP = " "
HTAB = '\\0x09'
HTAB = ''
EOL = Suppress(lineEnd)
WSP = SP + HTAB
COMMENT = Word(";", SafeChar + RestChar + WSP + '"') + EOL
LWSP = Suppress(Word(WSP))
nonEscapeChar = "}" + srange("[\\0x01-\\0x7c]") #+ srange("[\\0x7E-\\0xFF]")
nonEscapeChar = srange("[\\0x01-\\0x7c]") #+ srange("[\\0x7E-\\0xFF]")
octetString = Word(nonEscapeChar)
quotedString = dblQuotedString
UINT16 = Word(DIGIT, min=1, max=5)
UINT32 = Word(DIGIT, min=1, max=10)
NAME = Word(ALPHA, alphanums + "_", min=1, max=64)
VALUE = quotedString | Word(SafeChar) #| srange("[\\0x80-\\0xFF]")
EQUAL = Suppress("=")
LBRKT = Suppress( "{" )
RBRKT = Suppress( "}" )
COMMA = Suppress(",")
SEP = ((WSP | EOL | COMMENT) + LWSP)
AddToken = Suppress(Literal("Add")| "A")
AndAUDITSelectToken = Suppress(Literal("ANDLgc"))
AuditToken = Suppress(Literal("Audit")| "AT")
AuditCapToken = Suppress(Literal("AuditCapability")| "AC")
AuditValueToken = Suppress(Literal("AuditValue")| "AV")
AuthToken = Suppress(Literal("Authentication")| "AU")
BothToken = Suppress(Literal("Both")| "B")
BothwayToken = Suppress(Literal("Bothway")| "BW")
BriefToken = Suppress(Literal("Brief")| "BR")
BufferToken = Suppress(Literal("Buffer")| "BF")
CtxToken = Suppress(Literal("Context")| "C")
ContextAuditToken = Suppress(Literal("ContextAudit")| "CA")
ContextAttrToken = Suppress(Literal("ContextAttr")| "CT")
ContextListToken = Suppress(Literal("ContextList")| "CLT")
DigitMapToken = Suppress(Literal("DigitMap")| "DM")
DirectionToken = Suppress(Literal("SPADirection")| "SPADI")
DisconnectedToken = Suppress(Literal("Disconnected")| "DC")
DelayToken = Suppress(Literal("Delay")| "DL")
DurationToken = Suppress(Literal("Duration")| "DR")
EmbedToken = Suppress(Literal("Embed")| "EM")
EmergencyToken = Suppress(Literal("Emergency")| "EG")
EmergencyOffToken = Suppress(Literal("EmergencyOff")| "EGO")
EmergencyValueToken = Suppress(Literal("EmergencyValue")| "EGV")
ErrorToken = Suppress(Literal("Error")| "ER")
EventBufferToken = Suppress(Literal("EventBuffer")| "EB")
EventsToken = Suppress(Literal("Events")| "E")
ExternalToken = Suppress(Literal("External")| "EX")
FailoverToken = Suppress(Literal("Failover")| "FL")
ForcedToken = Suppress(Literal("Forced")| "FO")
GracefulToken = Suppress(Literal("Graceful")| "GR")
H221Token = Suppress(Literal("H221"))
H223Token = Suppress(Literal("H223"))
H226Token = Suppress(Literal("H226"))
HandOffToken = Suppress(Literal("HandOff")| "HO")
IEPSToken = Suppress(Literal("IEPSCall")| "IEPS")
ImmAckRequiredToken = Suppress(Literal("ImmAckRequired")| "IA")
InternalToken = Suppress(Literal("Internal")| "IT")
IntsigDelayToken = Suppress(Literal("Intersignal")| "SPAIS")
IsolateToken = Suppress(Literal("Isolate")| "IS")
InSvcToken = Suppress(Literal("InService")| "IV")
InterruptByEventToken = Suppress(Literal("IntByEvent")| "IBE")
InterruptByNewSignalsDescrToken = Suppress(Literal("IntBySigDescr")| "IBS")
IterationToken = Suppress(Literal("Iteration")| "IR")
KeepActiveToken = Suppress(Literal("KeepActive")| "KA")
LocalToken = Suppress(Literal("Local")| "L")
LocalControlToken = Suppress(Literal("LocalControl")| "O")
LockStepToken = Suppress(Literal("LockStep")| "SP")
MediaToken = Suppress(Literal("Media")| "M")
MegacopToken = Suppress(Literal("MEGACO")| "!")
MessageSegmentToken = Suppress(Literal("Segment")| "SM")
MethodToken = Suppress(Literal("Method")| "MT")
MgcIdToken = Suppress(Literal("MgcIdToTry")| "MG")
ModeToken = Suppress(Literal("Mode")| "MO")
ModifyToken = Suppress(Literal("Modify")| "MF")
ModemToken = Suppress(Literal("Modem")| "MD")
MoveToken = Suppress(Literal("Move")| "MV")
MTPToken = Suppress(Literal("MTP"))
MuxToken = Suppress(Literal("Mux")| "MX")
NeverNotifyToken = Suppress(Literal("NeverNotify")| "NBNN")
NotifyToken = Suppress(Literal("Notify")| "N")
NotifyCompletionToken = Suppress(Literal("NotifyCompletion")| "NC")
NotifyImmediateToken = Suppress(Literal("ImmediateNotify")| "NBIN")
NotifyRegulatedToken = Suppress(Literal("RegulatedNotify")| "NBRN")
Nx64kToken = Suppress(Literal("Nx64Kservice")| "N64")
ObservedEventsToken = Suppress(Literal("ObservedEvents")| "OE")
OnewayToken = Suppress(Literal("Oneway")| "OW")
OnewayBothToken = Suppress(Literal("OnewayBoth")| "OWB")
OnewayExternalToken = Suppress(Literal("OnewayExternal")| "OWE")
OnOffToken = Suppress(Literal("OnOff")| "OO")
OrAUDITselectToken = Suppress(Literal("ORLgc"))
OtherReasonToken = Suppress(Literal("OtherReason")| "OR")
OutOfSvcToken = Suppress(Literal("OutOfService")| "OS")
PackagesToken = Suppress(Literal("Packages")| "PG")
#...many literal tokens
TestToken = Suppress(Literal("Test")| "TE")
TimeOutToken = Suppress(Literal("TimeOut")| "TO")
TopologyToken = Suppress(Literal("Topology") | "TP")
TransToken = Suppress(Literal("Transaction") | "T")
ResponseAckToken = Suppress(Literal("TransactionResponseAck")| "K")
V18Token = Suppress(Literal("V18"))
V22Token = Suppress(Literal("V22"))
V22bisToken = Suppress(Literal("V22b"))
V32Token = Suppress(Literal("V32"))
V32bisToken = Suppress(Literal("V32b"))
V34Token = Suppress(Literal("V34"))
V76Token = Suppress(Literal("V76"))
V90Token = Suppress(Literal("V90"))
V91Token = Suppress(Literal("V91"))
VersionToken = Suppress(Literal("Version") | "V")
RSBRKT=Suppress("]")
LSBRKT=Suppress("[")
INEQUAL=Suppress(LWSP+(Literal(">")|"<"|"#")+LWSP)("INEQUAL")
ContextID=Combine((UINT32|"*"|"-"|"$"))("ContextID")
ItemID=NAME("ItemID")
PackageName=NAME("PackageName")
StreamID=UINT16("StreamID")
Time=(Word(DIGIT, exact=8))("Time")
Date=(Word(DIGIT, exact=8))("Date")
RequestID=(UINT32|"*")("RequestID")
digitMapName=NAME("digitMapName")
sigParameterName=NAME("sigParameterName")
signalListId=UINT16("signalListId")
pathDomainName=Group(Word(alphanums+"*",alphanums+"-"+"*"+".",max=64))("pathDomainName")
pathNAME=Combine(Optional("*")+NAME+Word("/"+"*"+alphanums+"_"+"$")+Optional("@"+pathDomainName))("pathNAME")
pkgdName=Combine((PackageName+SLASH+ItemID)|(PackageName+SLASH+"*")|("*"+SLASH+"*"))
alternativeValue=Combine((VALUE|LSBRKT+delimitedList(VALUE, COMMA)+RSBRKT|LBRKT+delimitedList(VALUE, COMMA) +RBRKT|LSBRKT+VALUE+COLON+VALUE+RSBRKT))("alternativeValue")
parmValue=((EQUAL+alternativeValue|INEQUAL+VALUE))("parmValue")
propertyParm=Group(pkgdName+parmValue)("propertyParm")
extensionParameter=Group(Literal("X")+(Literal("-")|"+")+Word(ALPHA+DIGIT,min=1,max=6))("extensionParameter")
contextIdList=Group(ContextListToken+EQUAL+LBRKT+delimitedList(ContextID, COMMA) +RBRKT)("contextIdList")
contextAttrDescriptor=Group(ContextAttrToken+LBRKT+(contextIdList|delimitedList(propertyParm, COMMA))+RBRKT)("contextAttr")
emergencyValue=Group(EmergencyValueToken+EQUAL+(EmergencyToken|EmergencyOffToken))("emergencyValue")
iepsValue=Group(IEPSToken+EQUAL+(Literal("ON")|"OFF"))("iepsValue")
priority=Group(PriorityToken+EQUAL+UINT16)("priority")
topologyDirection=Group(BothwayToken|IsolateToken|OnewayToken|OnewayExternalToken|OnewayBothToken)("topologyDirection")
TerminationID=(Literal("ROOT")|pathNAME|"$"|"*")("TerminationID")
terminationB=(TerminationID)("terminationB")
terminationA=(TerminationID)("terminationA")
eventStream=Group(StreamToken+EQUAL+StreamID)("eventStream")
topologyTriple=Group(terminationA+COMMA+terminationB+COMMA+topologyDirection+Optional(COMMA+eventStream))("topologyTriple")
topologyDescriptor=Group(TopologyToken+LBRKT+delimitedList(topologyTriple, COMMA)+RBRKT)("topology")
statisticsParameter=Group(pkgdName+Optional(EQUAL+VALUE|(LSBRKT+delimitedList(VALUE, COMMA)+RSBRKT)))("statisticsParameter")
statisticsDescriptor=Group(StatsToken+LBRKT+delimitedList(statisticsParameter, COMMA)+RBRKT)("statistics")
TimeStamp=Group(Date+"T"+Time)("TimeStamp")
packagesItem=Group(NAME+"-"+UINT16)("packagesItem")
Version=Group(Word(DIGIT,min=1,max=2))("Version")
packagesDescriptor=Group(PackagesToken+LBRKT+delimitedList(packagesItem, COMMA)+RBRKT)("packages")
extension=Group(extensionParameter+parmValue)("extension")
serviceChangeVersion=Group(VersionToken+EQUAL+Version)("serviceChangeVersion")
serviceChangeProfile=Group(ProfileToken+EQUAL+NAME+SLASH+Version)("serviceChangeProfile")
deviceName=Group(pathNAME)("deviceName")
TransactionID=UINT32("TransactionID")
ErrorCode=Word(DIGIT,min=1,max=4)("ErrorCode")
mtpAddress=Group(MTPToken+LBRKT+Word(HEXDIG,min=4,max=8)+RBRKT)("mtpAddress")
portNumber=Group(UINT16)("portNumber")
hex4=Group(Word(HEXDIG,min=1,max=4))("hex4")
hexseq=Group(hex4+ZeroOrMore(":"+hex4))("hexseq")
hexpart=Group(hexseq+"::"+Optional(hexseq)|"::"+Optional(hexseq)|hexseq)("hexpart")
V4hex=Group(Word(DIGIT,min=1,max=3))("V4hex")
IPv4address=Group(V4hex+DOT+V4hex+DOT+V4hex+DOT+V4hex)("IPv4address")
IPv6address=Group(hexpart+Optional(":"+IPv4address))("IPv6address")
domainAddress=Group(Literal("[")+(IPv4address|IPv6address)+"]")("domainAddress")
domainName=Group("<"+Word(ALPHA+DIGIT,ALPHA+DIGIT+"-"+".",max=64)+">")("domainName")
mId=(((domainAddress|domainName)+Optional(":"+portNumber))|mtpAddress|deviceName)("mId")
sigIntsigDelay=Group(IntsigDelayToken+EQUAL+UINT16)("sigIntsigDelay")
sigRequestID=(RequestIDToken+EQUAL+RequestID)("sigRequestID")
direction=Group(ExternalToken|InternalToken|BothToken)("direction")
sigDirection=Group(DirectionToken+EQUAL+direction)("sigDirection")
sigDuration=Group(DurationToken+EQUAL+UINT16)("sigDuration")
signalType=Group((OnOffToken|TimeOutToken|BriefToken))("signalType")
sigSignalType=Group(SignalTypeToken+EQUAL+signalType)("sigSignalType")
sigOther=Group(sigParameterName+parmValue)("sigOther")
sigStream=Group(StreamToken+EQUAL+StreamID)("sigStream")
signalName=Group(pkgdName)("signalName")
notificationReason=Group(TimeOutToken|InterruptByEventToken|InterruptByNewSignalsDescrToken|OtherReasonToken|IterationToken)("notificationReason")
notifyCompletion=Group(NotifyCompletionToken+EQUAL+(LBRKT+delimitedList(notificationReason, COMMA)+RBRKT))("notifyCompletion")
sigParameter=Group(sigStream|sigSignalType|sigDuration|sigOther|notifyCompletion|KeepActiveToken|sigDirection|sigRequestID|sigIntsigDelay)("sigParameter")
signalRequest=Group(signalName+Optional(LBRKT+delimitedList(sigParameter, COMMA)+RBRKT))("signalRequest")
signalListParm=Group(signalRequest)("signalListParm")
signalList=Group(SignalListToken+EQUAL+signalListId+LBRKT+delimitedList(signalListParm,COMMA)+RBRKT)("signalList")
signalParm=Group(signalList|signalRequest)("signalParm")
signalsDescriptor=Group(SignalsToken+Optional(LBRKT+delimitedList(signalParm,COMMA)+RBRKT))("signals")
serviceChangeMgcId=(MgcIdToken+EQUAL+mId)("serviceChangeMgcId")
serviceChangeAddress=Group(ServiceChangeAddressToken+EQUAL+(mId|portNumber))("serviceChangeAddress")
serviceChangeDelay=Group(DelayToken+EQUAL+UINT32)("serviceChangeDelay")
serviceChangeReason=Group(ReasonToken+EQUAL+VALUE)("serviceChangeReason")
serviceChangeMethod=Group(MethodToken+EQUAL+(FailoverToken|ForcedToken|GracefulToken|RestartToken|DisconnectedToken|HandOffToken|extensionParameter))("serviceChangeMethod")
servChgReplyParm=Group(serviceChangeAddress|serviceChangeMgcId|serviceChangeProfile|serviceChangeVersion|TimeStamp)("servChgReplyParm")
serviceChangeReplyDescriptor=Group(ServicesToken+LBRKT+delimitedList(servChgReplyParm, COMMA)+RBRKT)("serviceChangeReply")
auditReturnItem=Group((MuxToken|ModemToken|MediaToken|DigitMapToken|StatsToken|ObservedEventsToken|PackagesToken))("auditReturnItem")
indAudpackagesDescriptor=Group(PackagesToken+LBRKT+packagesItem+RBRKT)("indAudpackages")
indAudstatisticsDescriptor=Group(StatsToken+LBRKT+pkgdName+RBRKT)("indAudstatistics")
indAuddigitMapDescriptor=Group(DigitMapToken+EQUAL+(digitMapName))("indAuddigitMap")
indAudsignalRequestParm=Group(sigStream|sigRequestID)("indAudsignalRequestParm")
indAudsignalRequest=Group(signalName+Optional(LBRKT+delimitedList(indAudsignalRequestParm, COMMA)+RBRKT))("indAudsignalRequest")
indAudsignalListParm=Group(indAudsignalRequest)("indAudsignalListParm")
indAudsignalList=Group(SignalListToken+EQUAL+signalListId+Optional(LBRKT+indAudsignalListParm+RBRKT))("indAudsignalList")
indAudsignalParm=Group(indAudsignalList|indAudsignalRequest)("indAudsignalParm")
indAudsignalsDescriptor=Group(SignalsToken+LBRKT+Optional(indAudsignalParm)+RBRKT)("indAudsignals")
indAudrequestedEvent=Group(pkgdName)("indAudrequestedEvent")
indAudeventsDescriptor=Group(EventsToken+Optional(EQUAL+RequestID)+LBRKT+indAudrequestedEvent+RBRKT)("indAudevents")
eventParameterName=Group(NAME)("eventParameterName")
eventOther=Group(eventParameterName+parmValue)("eventOther")
indAudeventSpecParameter=Group(eventStream|eventParameterName)("indAudeventSpecParameter")
indAudeventSpec=Group(pkgdName+Optional(LBRKT+indAudeventSpecParameter+RBRKT))("indAudeventSpec")
indAudeventBufferDescriptor=Group(EventBufferToken+LBRKT+indAudeventSpec+RBRKT)("indAudeventBuffer")
serviceStatesValue=Group((TestToken|OutOfSvcToken|InSvcToken))("serviceStatesValue")
indAudterminationStateParm=Group(pkgdName|propertyParm|ServiceStatesToken+Optional((EQUAL|INEQUAL)+serviceStatesValue)|BufferToken)("indAudterminationStateParm")
indAudterminationStateDescriptor=Group(TerminationStateToken+LBRKT+indAudterminationStateParm+RBRKT)("indAudterminationState")
streamModes=(SendonlyToken|RecvonlyToken|SendrecvToken|InactiveToken|LoopbackToken)("Mode")
indAudlocalParm=Group(ModeToken+Optional((EQUAL|INEQUAL)+streamModes)|pkgdName|propertyParm|ReservedValueToken|ReservedGroupToken)("indAudlocalParm")
indAudlocalControlDescriptor=Group(LocalControlToken+LBRKT+delimitedList(indAudlocalParm, COMMA)+RBRKT)("indAudlocalControl")
indAudstreamParm=Forward()
indAudstreamDescriptor=Group(StreamToken+EQUAL+StreamID+LBRKT+indAudstreamParm+RBRKT)("indAudstream")
indAudlocalDescriptor=Group(LocalToken+LBRKT+octetString+RBRKT)("indAudlocal")
indAudremoteDescriptor=Group(RemoteToken+LBRKT+octetString+RBRKT)("indAudremote")
indAudstreamParm<<(indAudlocalControlDescriptor|indAudstatisticsDescriptor|indAudremoteDescriptor|indAudlocalDescriptor)
indAudmediaParm=indAudstreamParm|indAudstreamDescriptor|indAudterminationStateDescriptor
indAudmediaDescriptor=Group(MediaToken+LBRKT+delimitedList(indAudmediaParm, COMMA)+RBRKT)("indAudmedia")
indAudauditReturnParameter=indAudmediaDescriptor|indAudeventsDescriptor|indAudsignalsDescriptor|indAuddigitMapDescriptor|indAudeventBufferDescriptor|indAudstatisticsDescriptor|indAudpackagesDescriptor
indAudterminationAudit=Group(delimitedList(indAudauditReturnParameter, COMMA))("indAudterminationAudit")
auditItem=Group(auditReturnItem|SignalsToken|EventBufferToken|EventsToken|indAudterminationAudit)("auditItem")
serviceChangeParm=Group(serviceChangeMethod|serviceChangeReason|serviceChangeDelay|serviceChangeAddress|serviceChangeProfile|extension|TimeStamp|serviceChangeMgcId|serviceChangeVersion|ServiceChangeIncompleteToken|auditItem)("serviceChangeParm")
serviceChangeDescriptor=Group(ServicesToken+LBRKT+delimitedList(serviceChangeParm,COMMA)+RBRKT)("serviceChange")
digitMapLetter=DIGIT+srange("[\\0x41-\\0x4B]")+srange("[\\0x61-\\0x6B]")+"L"+"S"+"T"+"Z"
digitLetter=Group(Word(DIGIT+"-"+DIGIT)|Word(digitMapLetter))("digitLetter")
digitMapRange=Group((Literal("x")|(LWSP+"["+LWSP+digitLetter+LWSP+"]"+LWSP)))("digitMapRange")
digitPosition=Group(digitMapLetter|digitMapRange)("digitPosition")
digitStringElement=Group(digitPosition+Optional(DOT))("digitStringElement")
digitString=Group(OneOrMore(digitStringElement))("digitString")
digitStringList=Group(digitString+ZeroOrMore(LWSP+"|"+LWSP+digitString))("digitStringList")
digitMap=Group((digitString|LWSP+"("+LWSP+digitStringList+LWSP+")"+LWSP))("digitMap")
Timer=Group(Word(DIGIT,min=1,max=2))("Timer")
digitMapValue=Group(Optional("T"+COLON+Timer+COMMA)+Optional("S"+COLON+Timer+COMMA)+Optional("L"+COLON+Timer+COMMA)+Optional("Z"+COLON+Timer+COMMA)+digitMap)("digitMapValue")
digitMapDescriptor=Group(DigitMapToken+EQUAL+((LBRKT+digitMapValue+RBRKT)| (digitMapName+Optional(LBRKT+digitMapValue+RBRKT))))("digitMap")
modemType=Group(V32bisToken|V22bisToken|V18Token|V22Token|V32Token|V34Token|V90Token|V91Token|SynchISDNToken|extensionParameter)("modemType")
modemDescriptor=Group(ModemToken+((EQUAL+modemType)|(LSBRKT+delimitedList(modemType, COMMA)+RSBRKT))+Optional(LBRKT+delimitedList(propertyParm, COMMA)+RBRKT))("modem")
observedEventParameter=(eventStream|eventOther)
observedEvent=Group(Optional(TimeStamp+LWSP+COLON)+LWSP+pkgdName+Optional(LBRKT+delimitedList(observedEventParameter,COMMA)+RBRKT))("observedEvent")
observedEventsDescriptor=Group(ObservedEventsToken+EQUAL+RequestID+LBRKT+SkipTo(RBRKT)+RBRKT)("observedEvents")
eventDM=Group(DigitMapToken+EQUAL+((digitMapName)|(LBRKT+digitMapValue+RBRKT)))("eventDM")
embedSig=Group(EmbedToken+LBRKT+signalsDescriptor+RBRKT)("embedSig")
secondEventParameter=Forward()
secondRequestedEvent=Group(pkgdName+Optional(LBRKT+delimitedList(secondEventParameter, COMMA)+RBRKT))("secondRequestedEvent")
embedFirst=Group(EventsToken+Optional(EQUAL+RequestID+LBRKT+delimitedList(secondRequestedEvent,COMMA)+RBRKT))("embedFirst")
embedNoSig=Group(EmbedToken+LBRKT+embedFirst+RBRKT)("embedNoSig")
embedWithSig=Group(EmbedToken+LBRKT+signalsDescriptor+Optional(COMMA+embedFirst)+RBRKT)("embedWithSig")
notifyRegulated=Group(NotifyRegulatedToken+Optional(LBRKT+(embedWithSig|embedNoSig)+RBRKT))("notifyRegulated")
notifyBehaviour=Group(NotifyImmediateToken|notifyRegulated|NeverNotifyToken)("notifyBehaviour")
eventParameter=Group((embedWithSig|embedNoSig|KeepActiveToken|eventDM|eventStream|eventOther|notifyBehaviour|ResetEventsDescriptorToken))("eventParameter")
secondEventParameter<<(embedSig|KeepActiveToken|eventDM|eventStream|eventOther|notifyBehaviour|ResetEventsDescriptorToken)
requestedEvent=Group(pkgdName+Optional(LBRKT+delimitedList(eventParameter, COMMA)+RBRKT))("requestedEvent")
eventsDescriptor=Group(EventsToken+Optional(EQUAL+RequestID+LBRKT+delimitedList(requestedEvent,COMMA)+RBRKT))("events")
MuxType=Group((H221Token|H223Token|H226Token|V76Token|extensionParameter|Nx64kToken))("MuxType")
serviceStates=Group(ServiceStatesToken+EQUAL+serviceStatesValue)("serviceStates")
eventBufferControlValue=Group(("OFF"|LockStepToken))("eventBufferControlValue")
eventBufferControl=Group(BufferToken+EQUAL+eventBufferControlValue)("eventBufferControl")
terminationStateParm=Group((propertyParm|serviceStates|eventBufferControl))("terminationStateParm")
terminationStateDescriptor=Group(TerminationStateToken+LBRKT+delimitedList(terminationStateParm, COMMA)+RBRKT)("terminationState")
eventSpecParameter=Group((eventStream|eventOther))("eventSpecParameter")
eventSpec=Group(pkgdName+Optional(LBRKT+delimitedList(eventSpecParameter,COMMA)+RBRKT))("eventSpec")
eventBufferDescriptor=Group(EventBufferToken+Optional(LBRKT+delimitedList(eventSpec, COMMA)+RBRKT))("eventBuffer")
remoteDescriptor=(RemoteToken+LBRKT+octetString+RBRKT)("Remote")
localDescriptor=(LocalToken+LBRKT+octetString+RBRKT)("Local")
streamMode=(ModeToken+EQUAL+streamModes)
reservedGroupMode=Group(ReservedGroupToken+EQUAL+(Literal("ON")|"OFF"))("reservedGroupMode")
reservedValueMode=Group(ReservedValueToken+EQUAL+(Literal("ON")|"OFF"))("reservedValueMode")
localParm=(streamMode|propertyParm|reservedValueMode|reservedGroupMode)
localControlDescriptor=Group(LocalControlToken+LBRKT+delimitedList(localParm,COMMA)+RBRKT)("LocalControl")
localControlDescriptor=Group(LocalControlToken+LBRKT+octetString+RBRKT)("LocalControl")
streamParm=(localDescriptor|remoteDescriptor|localControlDescriptor|statisticsDescriptor)
streamDescriptor=Group(StreamToken+EQUAL+StreamID+LBRKT+delimitedList(streamParm, COMMA)+RBRKT)("Stream")
mediaParm=streamParm|streamDescriptor|terminationStateDescriptor
mediaDescriptor=Group(MediaToken+LBRKT+delimitedList(mediaParm, COMMA)+RBRKT)("Media")
terminationIDList=Group(LBRKT+delimitedList(TerminationID, COMMA)+RBRKT)("terminationIDList")
muxDescriptor=Group(MuxToken+EQUAL+MuxType+terminationIDList)("mux")
termIDList=Group((TerminationID|LSBRKT+TerminationID+OneOrMore(COMMA+TerminationID)+RSBRKT))("termIDList")
errorDescriptor=Group(ErrorToken+EQUAL+ErrorCode+LBRKT+Optional(quotedString)+RBRKT)("error")
serviceChangeReply=Group(ServiceChangeToken+EQUAL+termIDList+Optional(LBRKT+(errorDescriptor|serviceChangeReplyDescriptor)+RBRKT))("serviceChangeReply")
serviceChangeRequest=Group(ServiceChangeToken+EQUAL+termIDList+LBRKT+serviceChangeDescriptor+RBRKT)("serviceChangeRequest")
notifyReply=Group(NotifyToken+EQUAL+termIDList+Optional(LBRKT+errorDescriptor+RBRKT))("notifyReply")
notifyRequest=Group(NotifyToken+EQUAL+termIDList+LBRKT+(observedEventsDescriptor+Optional(COMMA+errorDescriptor))+RBRKT)("notifyRequest")
auditDescriptor=Group(AuditToken+LBRKT+Optional(delimitedList(auditItem,COMMA))+RBRKT)("audit")
auditReturnParameter=(mediaDescriptor|modemDescriptor|muxDescriptor|eventsDescriptor|signalsDescriptor|digitMapDescriptor|observedEventsDescriptor|eventBufferDescriptor|statisticsDescriptor|packagesDescriptor|errorDescriptor|auditReturnItem)
contextTerminationAudit=Group(EQUAL+CtxToken+(terminationIDList|LBRKT+errorDescriptor+RBRKT))("contextTerminationAudit")
terminationAudit=Group(delimitedList(auditReturnParameter,COMMA))("terminationAudit")
auditOther=Group(EQUAL+termIDList+Optional(LBRKT+terminationAudit+RBRKT))("auditOther")
auditReply=Group((AuditValueToken|AuditCapToken)+(contextTerminationAudit|auditOther))("auditReply")
auditRequest=Group((AuditValueToken|AuditCapToken)+EQUAL+termIDList+LBRKT+auditDescriptor+RBRKT)("auditRequest")
subtractRequest=Group(SubtractToken+EQUAL+termIDList+Optional(LBRKT+auditDescriptor+RBRKT))("subtractRequest")
ammsReply=Group((AddToken|MoveToken|ModifyToken|SubtractToken)+EQUAL+termIDList+Optional(LBRKT+terminationAudit+RBRKT))("ammsReply")
ammParameter=mediaDescriptor|modemDescriptor|muxDescriptor|eventsDescriptor|signalsDescriptor|digitMapDescriptor|eventBufferDescriptor|auditDescriptor|statisticsDescriptor
ammRequestBody=EQUAL+termIDList+Optional(LBRKT+delimitedList(ammParameter, COMMA)+RBRKT)
ammRequest=Group((MoveToken|ModifyToken)+ ammRequestBody )("ammRequest")
addRequest=Group(AddToken + ammRequestBody)("Add")
segmentNumber=UINT16("segmentNumber")
segmentReply=Group(MessageSegmentToken+EQUAL+TransactionID+SLASH+segmentNumber+Optional(SLASH+SegmentationCompleteToken))("segmentReply")
commandRequest=addRequest| ammRequest | subtractRequest|auditRequest|notifyRequest|serviceChangeRequest
commandRequestList=delimitedList(Optional("O-")+Optional("W-")+commandRequest, COMMA)
AndAUDITselectToken=Group(Literal("TODO"))("AndAUDITselectToken")
OrAUDITselectToken=Group(Literal("TODO"))("OrAUDITselectToken")
auditSelectLogic=Group(Optional(AndAUDITselectToken|OrAUDITselectToken))("auditSelectLogic")
contextAuditSelector=Group(priority|emergencyValue|iepsValue|contextAttrDescriptor|auditSelectLogic)("contextAuditSelector")
contextAuditProperties=Group((TopologyToken|EmergencyToken|PriorityToken|IEPSToken|pkgdName|contextAuditSelector))("contextAuditProperties")
indAudcontextAttrDescriptor=Group(ContextAttrToken+LBRKT+delimitedList(contextAuditProperties, COMMA)+RBRKT)("indAudcontextAttr")
contextAudit=Group(ContextAuditToken+LBRKT+(delimitedList(contextAuditProperties,COMMA))|indAudcontextAttrDescriptor+RBRKT)("contextAudit")
contextProperty=Group((topologyDescriptor|priority|EmergencyToken|EmergencyOffToken|iepsValue|contextAttrDescriptor))("contextProperty")
contextProperties=Group(delimitedList(contextProperty,COMMA))("contextProperties")
commandReplys=(serviceChangeReply|auditReply|ammsReply|notifyReply)
commandReplyList=Group(delimitedList(commandReplys,COMMA))("commandReplyList")
commandReply=Group(((contextProperties+Optional(COMMA+commandReplyList))|commandReplyList))("commandReply")
actionReply=Group(CtxToken+EQUAL+ContextID+Optional(LBRKT+(errorDescriptor|commandReply|(commandReply+COMMA+errorDescriptor))+RBRKT))("actionReply")
actionReplyList=Group(delimitedList(actionReply,COMMA))("actionReplyList")
transactionReply=Group(ReplyToken+EQUAL+TransactionID+Optional(SLASH+segmentNumber+Optional(SLASH+SegmentationCompleteToken))+LBRKT+Optional(ImmAckRequiredToken+COMMA)+(errorDescriptor|actionReplyList)+RBRKT)("TransactionReply")
contextRequest=Group(((contextProperties+Optional(COMMA+contextAudit))|contextAudit))("contextRequest")
actionRequest=Group(CtxToken+EQUAL+ContextID+LBRKT+((contextRequest+Optional(COMMA+commandRequestList))|commandRequestList)+RBRKT)("actionRequest")
transactionRequest=Group(TransToken+EQUAL+TransactionID+LBRKT+delimitedList(actionRequest, COMMA)+RBRKT)("TransactionRequest")
transactionAck=Group(TransactionID|(TransactionID+"-"+TransactionID))("TransactionAck")
transactionResponseAck=Group(ResponseAckToken+LBRKT+transactionAck+ZeroOrMore(COMMA+transactionAck)+RBRKT)("transactionResponseAck")
transactionPending=Group(PendingToken+EQUAL+TransactionID+LBRKT+RBRKT)("transactionPending")
transactionList=OneOrMore(transactionRequest|transactionReply|transactionPending|transactionResponseAck|segmentReply)
messageBody=(transactionList | errorDescriptor)
Message=Group(MegacopToken+SLASH+Version+SkipTo(lineEnd)+messageBody)("Message")
AuthData=Group("0x"+Word(HEXDIG,min=24,max=64))("AuthData")
SequenceNum=Group("0x"+Word(HEXDIG,exact=8))("SequenceNum")
SecurityParmIndex=Group("0x"+Word(HEXDIG,exact=8))("SecurityParmIndex")
authenticationHeader=Group(AuthToken+EQUAL+SecurityParmIndex+COLON+SequenceNum+COLON+AuthData)("authenticationHeader")
megacoMessage=Group(LWSP+Optional(authenticationHeader+SEP)+Message)("megacoMessage").streamline()
Have you tried pickling your parser? I have done this before with a complex parser (for Python itself), and got a reasonable improvement in startup time. Save the pickled parser off to a file, and then to parse a single message, unpickle the parser from the file instead of explicitly building it. If streamline is an expensive part of the setup, then be sure to call parser.streamline() before pickling; then, when actually parsing a message, this expensive streamline step can be skipped.
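The pickling idea above might look roughly like the following. This is only a sketch under assumptions: build_parser, load_parser and the cache file name are hypothetical, the tiny grammar is a stand-in for the real 500-line one, and a grammar with parse actions defined as lambdas or closures may not pickle at all.

```python
import os
import pickle
import tempfile

from pyparsing import Suppress, Word, alphas, nums


def build_parser():
    """Stand-in for the expensive grammar construction (hypothetical)."""
    key = Word(alphas)("key")
    value = Word(nums)("value")
    parser = Suppress("Add") + key + Suppress("=") + value
    parser.streamline()  # do the expensive step once, before pickling
    return parser


def load_parser(cache_path):
    """Unpickle the parser if a cache file exists, else build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    parser = build_parser()
    with open(cache_path, "wb") as f:
        pickle.dump(parser, f, pickle.HIGHEST_PROTOCOL)
    return parser


# Hypothetical cache location; a real program would use a stable path.
cache_path = os.path.join(tempfile.mkdtemp(), "h248_parser.pickle")
first = load_parser(cache_path)    # builds the parser and writes the cache
second = load_parser(cache_path)   # reads the parser back from the cache
result = second.parseString("Add foo = 42")
print(result["key"], result["value"])
```

The cache file has to be regenerated whenever the grammar source or the installed pyparsing version changes, since the pickle stores the parser objects as they were built.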
Your parser construction might also be inefficient if, instead of defining low-level primitives, you explicitly redefine the same expressions over and over: for instance, using Word(nums) repeatedly, instead of defining a low-level expression integer = Word(nums) and then referring to integer in places where integers are expected. It is rare for this to have an effect at parser construction time (I have only seen one other person actually have this problem), but it can happen.
Are you using a function to define complex sub-parsers that follow a pattern? Try memoizing that function, so that repeated calls with the same arguments return the same parsing expression instead of new expressions. Then these repeated expressions will only get streamlined once.
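A minimal sketch of such memoization, assuming a hypothetical helper named token that builds the long-form/short-form keyword expressions (the original grammar does not use such a helper; it is introduced here for illustration only):

```python
from functools import lru_cache

from pyparsing import oneOf


@lru_cache(maxsize=None)
def token(long_form, short_form):
    """Return one shared, suppressed expression per (long, short) pair."""
    return oneOf([long_form, short_form]).suppress()


add_a = token("Add", "A")
add_b = token("Add", "A")      # served from the cache, not rebuilt
print(add_a is add_b)
```

Because both calls return the same object, the expression is built, and later streamlined, only once, however many times the helper is called with those arguments.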
Do you have a complex low-level expression that is used in many places with different results names? This will actually create copies of that expression under the covers. You could explicitly call streamline as part of the definition step, and the copies would be streamlined ahead of time. I could envision this kind of thing if you have an expression for a real number. In fact, I've seen significant parse-time improvement if you replace a real number definition made up by assembling pyparsing bits (something like Combine(Optional(oneOf('- +')) + Word(nums) + Optional('.' + Word(nums)) + Optional(oneOf('E e') + Optional(oneOf('- +')) + Word(nums)))) with a single Regex expression (Regex(r'[-+]?\d+(\.\d*)?([Ee][-+]?\d+)?') would correspond to the previous expression). No need to go overboard on this though; pyparsing already represents many parser expressions internally as regexes.
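To make the comparison concrete, here is a small sketch that builds both forms of the real-number expression and checks that they yield the same token on a few sample inputs. The names assembled, as_regex and samples are illustrative only.

```python
from pyparsing import Combine, Optional, Regex, Word, nums, oneOf

sign = Optional(oneOf("- +"))

# Real number assembled from several pyparsing elements.
assembled = Combine(
    sign
    + Word(nums)
    + Optional("." + Word(nums))
    + Optional(oneOf("E e") + sign + Word(nums))
)

# Roughly the same thing as a single Regex element.
as_regex = Regex(r"[-+]?\d+(\.\d*)?([Ee][-+]?\d+)?")

samples = ["42", "-3.14", "6.02e23", "1.5E-9"]
for text in samples:
    a = assembled.parseString(text)[0]
    b = as_regex.parseString(text)[0]
    print(text, a == b)
```

The two are not strictly identical: the regex uses \d* after the decimal point, so it also accepts a trailing dot with no digits ("1."), which the assembled version does not consume.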
Well, that's all I can come up with just on speculation. For more specific suggestions you'll need to post your 500-liner on a pastebin somewhere, so we can see the details of what is happening.
EDIT: One thing you can do is try replacing all of the literal definitions like:
AddToken = Suppress(Literal("Add")| "A")
with
AddToken = oneOf("Add A").suppress()
This will substantially reduce the number of elements in your parser, so there are far fewer bits to be streamlined or pickled.
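As a quick check that the replacement behaves the same, the sketch below parses both spellings of the token with each form. The names verbose and compact, and the trailing number in the test strings, are made up for the demonstration.

```python
from pyparsing import Literal, Suppress, Word, nums, oneOf

verbose = Suppress(Literal("Add") | "A")   # original style: three elements
compact = oneOf("Add A").suppress()        # suggested style

for tok in (verbose, compact):
    for text in ("Add 7", "A 7"):
        # The keyword is suppressed, so only the number remains.
        print((tok + Word(nums)).parseString(text).asList())   # ['7']
```

oneOf also takes care of testing longer alternatives first when one alternative is a prefix of another, so the order of the words in the string does not matter.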
This looks like an error:
digitLetter=Group(Word(DIGIT+"-"+DIGIT)|Word(digitMapLetter))("digitLetter")
There is no reason to repeat the DIGIT string inside a Word definition; Word treats its argument as a set of allowed characters, not as a pattern.
Don't create Words with whitespace in them; you are very likely to read beyond what you intended. Your use of LWSP also looks improper, since you are trying to do your own whitespace skipping. If you must do this, change Word(WSP) to White(WSP). But really, why are you doing your own whitespace skipping?
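For reference, pyparsing skips leading whitespace before each element by default, so a grammar usually needs no explicit whitespace expressions at all. A minimal sketch, with a made-up assignment expression and input string:

```python
from pyparsing import Suppress, Word, alphas, nums

# No LWSP-style elements: whitespace between tokens is skipped automatically.
assignment = Word(alphas) + Suppress("=") + Word(nums)

tokens = assignment.parseString("priority   =\t 15").asList()
print(tokens)   # ['priority', '15']
```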
Change COMMENT to:
COMMENT = ';' + restOfLine + lineEnd