Decoding word-encoded Content-Disposition header file name in Objective-C

I am trying to retrieve a file name that can't be represented in ASCII from the Content-Disposition header.

This file name is word-encoded. Below is the encoded file name:

=?UTF-8?Q?=C3=ABst=C3=A9_=C3=A9_=C3=BAm_n=C3=B4m=C3=A9?= =?UTF-8?Q?_a=C3=A7ent=C3=BAad=C3=B5.xlsx?=

How do I get the decoded file name (that actually is "ësté é úm nômé açentúadõ.xlsx")?

PS: I am looking for an Objective-C implementation.


You probably want a MIME handling framework, but I searched online and found neither a suitable one nor an example, so I'm just showing the algorithm here. It's not the best example, since it makes a big assumption: that the string is always UTF-8 and Q-encoded.

Q-encoding is similar to URL-encoding (percent-encoding), which Foundation's NSString can already decode. The practical differences when decoding (the differences are bigger when encoding) are that escapes start with = instead of %, and that an underscore stands for a space.

Then there's the lead-in and lead-out. Each encoded block has the format =?charset-name?encoding-type?encoded-text?=. You should really read the charset name and use that encoding, and you should also read the encoding type, since it may be "Q" or "B" (Base64).
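As a rough sketch of reading those fields rather than hard-coding them, something like the following could split a single encoded block. SplitEncodedWord and its parameter names are hypothetical (they are not Foundation APIs), and the sketch assumes the input is exactly one block with no surrounding text. The charset lookup uses Core Foundation's CFStringConvertIANACharSetNameToEncoding.

```objc
#import <Foundation/Foundation.h>

// SplitEncodedWord is a hypothetical helper, not a Foundation API.
// It splits one encoded block such as "=?UTF-8?Q?a=C3=A7?=" into its three fields.
// Returns NO if the shape is wrong or the charset name is not recognized.
static BOOL SplitEncodedWord(NSString *word,
                             NSStringEncoding *outCharset,
                             NSString **outType,
                             NSString **outPayload) {
    if (word.length < 4 || ![word hasPrefix:@"=?"] || ![word hasSuffix:@"?="]) {
        return NO;
    }

    // Drop the leading "=?" and the trailing "?=".
    NSString *inner = [word substringWithRange:NSMakeRange(2, word.length - 4)];

    // Neither Q nor B payloads contain a literal "?", so exactly three fields remain.
    NSArray *fields = [inner componentsSeparatedByString:@"?"];
    if (fields.count != 3) {
        return NO;
    }

    // Map the IANA charset name (e.g. "UTF-8", "ISO-8859-1") to an NSStringEncoding.
    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)fields[0]);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        return NO;
    }

    *outCharset = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    *outType = [fields[1] uppercaseString]; // "Q" or "B"; compared case-insensitively
    *outPayload = fields[2];
    return YES;
}
```

The returned type tells the caller which payload decoder to use, and the returned encoding replaces the hard-coded NSUTF8StringEncoding.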

The example below only works for Q-encoding (a close relative of quoted-printable). You should be able to modify it easily to handle other charsets and Base64 encoding, however.

#import <Foundation/Foundation.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSString *encodedString = @"=?UTF-8?Q?=C3=ABst=C3=A9_=C3=A9_=C3=BAm_n=C3=B4m=C3=A9?= =?UTF-8?Q?_a=C3=A7ent=C3=BAad=C3=B5.xlsx?=";   
    NSScanner *scanner = [NSScanner scannerWithString:encodedString];
    NSString *buf = nil;
    NSMutableString *decodedString = [[NSMutableString alloc] init];

    while ([scanner scanString:@"=?UTF-8?Q?" intoString:NULL]
        || ([scanner scanUpToString:@"=?UTF-8?Q?" intoString:&buf] && [scanner scanString:@"=?UTF-8?Q?" intoString:NULL])) {
        if (buf != nil) {
            [decodedString appendString:buf];
        }

        buf = nil;

        NSString *encodedRange;

        if (![scanner scanUpToString:@"?=" intoString:&encodedRange]) {
            break; // Invalid encoding
        }

        [scanner scanString:@"?=" intoString:NULL]; // Skip the terminating "?="

        // Decode the encoded portion (naively using UTF-8 and assuming it really is Q encoded)
        // I'm doing this really naively, but it should work

        // Firstly I'm encoding % signs so I can cheat and turn this into a URL-encoded string, which NSString can decode
        encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"%" withString:@"=25"];

        // Turn this into a URL-encoded string
        encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"=" withString:@"%"];

        // Q-encoding represents a space as an underscore, so turn underscores back into spaces
        encodedRange = [encodedRange stringByReplacingOccurrencesOfString:@"_" withString:@" "];

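        NSString *decoded = [encodedRange stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

        if (decoded == nil) {
            break; // The unescaped bytes were not valid UTF-8; appending nil would throw
        }

        [decodedString appendString:decoded];
    }

    // Keep any plain text that follows the last encoded block
    if (buf != nil) {
        [decodedString appendString:buf];
    }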

    NSLog(@"Decoded string = %@", decodedString);

    [decodedString release];

    [pool drain];

    return 0;
}

This outputs:

chrisbook-pro:~ chris$ ./qp-decode
2010-12-01 18:54:42.903 qp-decode[9643:903] Decoded string = ësté é úm nômé açentúadõ.xlsx
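For the "B" case mentioned above, the payload is plain Base64, so a minimal sketch could look like this. DecodeBPayload is a hypothetical helper name, the sketch assumes ARC and a system where NSData's initWithBase64EncodedString:options: is available (OS X 10.9 / iOS 7 or later), and it expects only the payload between the third "?" and the closing "?=".

```objc
#import <Foundation/Foundation.h>

// DecodeBPayload is a hypothetical helper, not a Foundation API.
// It decodes the Base64 payload of a B-encoded block using the given charset.
// Returns nil if the payload is not valid Base64 or the bytes do not fit the charset.
static NSString *DecodeBPayload(NSString *payload, NSStringEncoding charset) {
    // With options:0, any character outside the Base64 alphabet makes this return nil.
    NSData *bytes = [[NSData alloc] initWithBase64EncodedString:payload options:0];
    if (bytes == nil) {
        return nil;
    }

    // Interpret the raw bytes in the charset named by the encoded block.
    return [[NSString alloc] initWithData:bytes encoding:charset];
}
```

For instance, the payload of =?UTF-8?B?w6k=?= is w6k=, which decodes to the bytes C3 A9, that is "é" in UTF-8.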


I created a simpler, working method using a trick involving NSString percent escapes; it is described here:

https://stackoverflow.com/a/10888548/285694


I recently implemented an NSString category that decodes MIME Encoded-Word strings with either Q-encoding or B-encoding.

The code is available on GitHub and is briefly explained in this answer.
