MediaWiki 1.16.0 - In phpmyadmin select the current articles
I'm trying to get all articles that are the current/latest articles in Mediawiki 1.16.0. I need to do this in ph开发者_高级运维pMyadmin and make a dump from those results.
my SQL:
SELECT
page.page_title, page.page_latest
, revision.rev_id, revision.rev_text_id
, text.old_id, text.old_text
FROM page, revision, text
WHERE rev_id = page_latest AND rev_text_id = old_id
I get the image names also but not a problem. I feel that this SQL above is not getting the latest version of the articles.
If there is a way to not show image names and redirects in the results it would also help.
First of all please don't use that ugly implicit join syntax. It's confusing and error-prone.
Change it to this:
SELECT
page.page_title, page.page_latest
, revision.rev_id, revision.rev_text_id
, text.old_id, text.old_text
FROM page
INNER JOIN revision ON (rev_id = page_latest)
INNER JOIN text ON (rev_text_id = old_id)
Now you can see why: it's getting all pages. There is no where
clause, there are just join
clauses.
This is the DB layout: http://upload.wikimedia.org/wikipedia/commons/b/b7/MediaWiki_database_schema_1-17_%28r82044%29.png
And here are the description of the fields in the various tables:
http://www.mediawiki.org/wiki/Manual:Database_layout
Revised query
SELECT
p.page_title, p.page_latest
, MAX(revision.rev_id) as rev_id, revision.rev_text_id
, text.old_id, text.old_text
FROM page p
INNER JOIN revision r ON (r.rev_id = p.page_latest)
INNER JOIN `text` t ON (r.rev_text_id = t.old_id)
WHERE p.page_is_redirect = 0 AND p.page_namespace <> 6 /*NS_IMAGE = 6*/
GROUP BY p.page_latest
ORDER BY p.page_title
This filters out the redirects and excludes the pages where namespace = ns_image
.
I'm not 100% sure though 'cause I don't have MediaWiki to test.
精彩评论