Retrieve JavaScript comments in JavaScript, or: how do I parse JS in JS?
I am looking for a way to access JavaScript comments from some (other) JavaScript code. I plan to use this to display low-level help information for elements on the page that call various JS functions, without duplicating that information in multiple places.
mypage.html:
...
<script src="foo.js"></script>
...
<span onclick="foo(bar);">clickme</span>
<span onclick="showhelpfor('foo');">?</span>
...
foo.js:
/**
* This function does foo.
* Call it with bar. Yadda yadda "groo".
*/
function foo(x)
{
...
}
I figure I can use getElementsByTagName to grab the script tag, then load the file with an AJAX request to get its plain-text content. However, I'd then need a way to parse the JavaScript reliably (i.e. not with a bunch of hacked-together regexps) that preserves the characters that simply eval'ing it would throw away.
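The loading step could be sketched roughly as follows. This is only an illustration: `scriptUrls` and `loadSource` are hypothetical helper names, and the standard `fetch` API is used in place of a hand-rolled AJAX request. In a browser, the usual same-origin restrictions apply to the fetched files.

```javascript
// Collect the src URLs of all external <script> tags in a document-like object.
// `doc` only needs getElementsByTagName, so a stub works outside the browser.
function scriptUrls(doc) {
  var tags = doc.getElementsByTagName('script');
  var urls = [];
  for (var i = 0; i < tags.length; i++) {
    if (tags[i].src) {
      urls.push(tags[i].src); // inline scripts have no src and are skipped
    }
  }
  return urls;
}

// Fetch one script as plain text, so that its comments are still present.
function loadSource(url) {
  return fetch(url).then(function (response) {
    if (!response.ok) {
      throw new Error('HTTP ' + response.status + ' for ' + url);
    }
    return response.text();
  });
}

// Usage in a page (not executed here):
// scriptUrls(document).forEach(function (url) {
//   loadSource(url).then(function (text) { /* hand the text to a parser */ });
// });
```

The text returned by `loadSource` is the raw file, so whatever parsing approach is chosen still has to deal with strings and comments itself.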
I was thinking of simply putting the documentation after the function, in a js string, but that's awkward and I have a feeling getting doxygen to pick that up will be difficult.
function foo(x) { ... }
foo.comment = "\
This function does foo.\
Call it with bar. Yadda yadda \"groo\".\
";
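For what it's worth, here is a minimal sketch of how that string-property idea might be hooked up to the help link. The `helpFor` function and the `scope` object are assumptions made for illustration; in a page, `scope` could be `window` for global functions.

```javascript
// Documentation stored as a property on the function itself.
function foo(x) { return x; }
foo.comment = [
  'This function does foo.',
  'Call it with bar. Yadda yadda "groo".'
].join('\n');

// Look up the help text for a function name in a given scope object.
// Returns null when the function is unknown or has no documentation attached.
function helpFor(name, scope) {
  var fn = scope[name];
  if (typeof fn !== 'function' || typeof fn.comment !== 'string') {
    return null;
  }
  return fn.comment;
}

var scope = { foo: foo };
console.log(helpFor('foo', scope));
```

Joining an array of lines avoids the backslash line continuations, but it does not solve the doxygen problem: the text is still a string rather than a comment.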
You could create a little parser that does not parse the complete JS language, but only matches string literals, single- and multi-line comments and functions of course.
There's a JS parser generator called PEG.js that could do this fairly easily. The grammar could look like this:
{
var functions = {};
var buffer = '';
}
start
= unit* {return functions;}
unit
= func
/ string
/ multi_line_comment
/ single_line_comment
/ any_char
func
= m:multi_line_comment spaces? "function" spaces id:identifier {functions[id] = m;}
/ "function" spaces id:identifier {functions[id] = null;}
multi_line_comment
= "/*"
( !{return buffer.match(/\*\//)} c:. {buffer += c;} )*
{
var temp = buffer;
buffer = '';
return "/*" + temp.replace(/\s+/g, ' ');
}
single_line_comment
= "//" [^\r\n]*
identifier
= a:([a-z] / [A-Z] / "_") b:([a-z] / [A-Z] / [0-9] /"_")* {return a + b.join("");}
spaces
= [ \t\r\n]+ {return "";}
string
= "\"" ("\\" . / [^"])* "\""
/ "'" ("\\" . / [^'])* "'"
any_char
= .
When you parse the following source with the generated parser:
/**
* This function does foo.
* Call it with bar. Yadda yadda "groo".
*/
function foo(x)
{
...
}
var s = " /* ... */ function notAFunction() {} ... ";
// function alsoNotAFunction()
// { ... }
function withoutMultiLineComment() {
}
var t = ' /* ... */ function notAFunction() {} ... ';
/**
* BAR!
* Call it?
*/
function doc_way_above(x, y, z) {
...
}
// function done(){};
the start() function of the parser returns the following map:
{
"foo": "/** * This function does foo. * Call it with bar. Yadda yadda \"groo\". */",
"withoutMultiLineComment": null,
"doc_way_above": "/** * BAR! * Call it? */"
}
I realize there are some gaps to be filled (like this.id = function() { ... }), but after reading the PEG.js docs a bit, that shouldn't be a big problem (assuming you know a little about parser generators). If it is a problem, post back and I'll add it to the grammar and explain a bit about what's happening in the grammar.
You can even test the grammar posted above online!
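If a parser generator feels like too much, roughly the same idea can be sketched by hand. The scanner below is not the PEG.js parser and not a full JS parser: `extractDocs` is a hypothetical name, and it only recognises block comments, line comments, quoted strings and named `function` declarations. Regex literals and template literals are not handled, so treat it as a starting point under those assumptions.

```javascript
// Map each named function to the block comment directly above it (or null).
function extractDocs(src) {
  var docs = {};
  var pending = null; // last block comment not yet followed by other code
  var funcRe = /function\s+([A-Za-z_$][A-Za-z0-9_$]*)/y; // sticky: match at lastIndex only
  var wordRe = /[A-Za-z_$][A-Za-z0-9_$]*/y;
  var i = 0;
  while (i < src.length) {
    var ch = src.charAt(i);
    var two = src.slice(i, i + 2);
    if (two === '/*') {
      // Block comment: remember it, with whitespace collapsed to single spaces.
      var end = src.indexOf('*/', i + 2);
      end = end === -1 ? src.length : end + 2;
      pending = src.slice(i, end).replace(/\s+/g, ' ');
      i = end;
    } else if (two === '//') {
      // Line comment: skip to the end of the line.
      var nl = src.indexOf('\n', i);
      i = nl === -1 ? src.length : nl;
      pending = null;
    } else if (ch === '"' || ch === "'") {
      // String literal: skip to the closing quote, honouring backslash escapes.
      i++;
      while (i < src.length && src.charAt(i) !== ch) {
        i += src.charAt(i) === '\\' ? 2 : 1;
      }
      i++;
      pending = null;
    } else if (/\s/.test(ch)) {
      i++; // whitespace between a comment and its function is allowed
    } else if (/[A-Za-z_$]/.test(ch)) {
      funcRe.lastIndex = i;
      var m = funcRe.exec(src);
      if (m) {
        docs[m[1]] = pending; // named function: attach the pending comment
        i = funcRe.lastIndex;
      } else {
        wordRe.lastIndex = i; // any other identifier or keyword: skip it whole
        wordRe.exec(src);
        i = wordRe.lastIndex;
      }
      pending = null;
    } else {
      i++; // punctuation, digits, operators
      pending = null;
    }
  }
  return docs;
}

var sample = [
  '/**',
  ' * This function does foo.',
  ' */',
  'function foo(x) {}',
  'var s = " /* ... */ function notAFunction() {} ";',
  'function withoutMultiLineComment() {}'
].join('\n');
console.log(extractDocs(sample));
```

A `showhelpfor(name)` handler could then look the name up in the returned map and display the comment, much as it would with the map produced by the generated parser.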
You could put a unique string identifier at the beginning of every comment and then craft a regex that uses that identifier to extract the comment.
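As a rough sketch of that idea, assume every help comment opens with a made-up marker `/**@help <name>`. Both the marker and the `extractMarkedHelp` function are invented for this example. Note the trade-off: a string literal that happens to contain the marker would also match.

```javascript
// Assumed convention: help comments look like
//   /**@help foo
//    * text...
//    */
var HELP_RE = /\/\*\*@help\s+([A-Za-z_$][\w$]*)([\s\S]*?)\*\//g;

function extractMarkedHelp(src) {
  var help = {};
  var m;
  HELP_RE.lastIndex = 0; // the regex is global, so reset it before each scan
  while ((m = HELP_RE.exec(src)) !== null) {
    // Strip the leading " * " decoration from every line of the comment body.
    help[m[1]] = m[2].split('\n').map(function (line) {
      return line.replace(/^\s*\*?\s?/, '');
    }).join('\n').trim();
  }
  return help;
}

var source = [
  '/**@help foo',
  ' * This function does foo.',
  ' * Call it with bar.',
  ' */',
  'function foo(x) {}',
  '/* an ordinary comment, ignored */'
].join('\n');
console.log(extractMarkedHelp(source));
```

Because the name is part of the marker, the comment does not have to sit directly above the function it documents.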