How should I parse data from an input .mov file and write the parsed data to a separate .txt file?

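I have a .mov file that shows a bunch of information when I open it in a text editor. I want to write a program that extracts that information and creates a .txt file, so that I can run it each month when I get a new .mov file.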

Here is the information in the .mov file:

[Categories]
1   hollywoodhd 1295    1295    Movies  1
2   mega    1095    1095    Movies  1
3   still   1095    1095    Movies  1
4   special 895 895 Movies  1
5   family  1095    1095    Movies  1
[Titles]
  hollywoodhd1 1 0 8046 0 919 PG-13 6712 1 identity_hd "(HD) Identity Thief" Disk 0 04/15/13 11/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
  hollywoodhd2 3 0 8016 0 930 PG 5347 1 escapep_hd "(HD) Escape from Planet Earth" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
  hollywoodhd3 1 0 8012 0 930 PG-13 5828 1 darkski_hd "(HD) Dark Skies" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
  hollywoodhd3 2 0 8007 0 928 PG-13 5735 1 guilttri_hd "(HD) The Guilt Trip" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
  hollywoodhd3 3 0 8013 0 928 PG-13 7813 1 jackreac_hd "(HD) Jack Reacher" Disk 0 04/01/13 10/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0
  hollywoodhd4 1 0 7993 0 919 PG-13 9500 1 lesmiser_hd "(HD) Les Miserables" Disk 0 03/06/13 09/01/13 0 0 0 0 0 0 1 1 0 16000000 H3 16:9 0 0

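From everything after [Titles], I only need the movie title and the rating written to the text file. If possible, I would like to prompt the user for the file to read from, so that anyone else who runs the program can type in the file's name or path.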

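I plan to use C++. The point is to automate this so that I can run the program each month and pull out only the movie titles and ratings, instead of copying and pasting into Word and then deleting everything else.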

In Python (not the "right" way, but general enough):

import re

ratings = {}  # maps movie title -> rating
# open up the .mov file, check for tell-tale patterns and pair them up
with open('yourfilehere.mov', 'r') as infile:
    for line in infile:
        # titles contain spaces, so match the whole quoted string rather than single words
        title = re.search(r'"([^"]*)"', line)
        if not title:
            continue
        # the rating comes before the title, so only scan that part of the line
        for word in line[:title.start()].split():
            if word in ('G', 'PG', 'PG-13', 'R', 'X'):
                ratings[title.group(1)] = word
# write the pairings into a new file, separated by a tab
with open('output.txt', 'w') as outfile:
    for title, rating in ratings.items():
        outfile.write(title + "\t" + rating + "\n")

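Hopefully this is enough to point you in the right direction for your own code. Consider looking for a library that handles .mov metadata, or at least find a good resource on the structure of .mov files.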
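If the layout in the sample holds for every file, a slightly stricter variant is to read only the lines under [Titles] and pick the fields by position. The sketch below is only that: it assumes the rating is always the 7th field and the quoted title the 11th (as in the sample lines), and the names `parse_titles` and `extract` are made up for illustration. `shlex.split` is used so that a quoted title stays one field.

```python
import shlex

RATING_FIELD = 6   # assumption: 7th field, as in the sample lines
TITLE_FIELD = 10   # assumption: 11th field (the quoted title)

def parse_titles(lines):
    """Return (title, rating) pairs found under the [Titles] header."""
    pairs = []
    in_titles = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith('['):
            # Section header such as [Categories] or [Titles].
            in_titles = (stripped == '[Titles]')
            continue
        if not in_titles or not stripped:
            continue
        fields = shlex.split(stripped)  # keeps "(HD) Identity Thief" together
        if len(fields) > TITLE_FIELD:
            pairs.append((fields[TITLE_FIELD], fields[RATING_FIELD]))
    return pairs

def extract(in_path, out_path):
    """Read in_path and write one 'title<TAB>rating' line per movie."""
    with open(in_path, 'r') as infile:
        pairs = parse_titles(infile)
    with open(out_path, 'w') as outfile:
        for title, rating in pairs:
            outfile.write(title + '\t' + rating + '\n')

# To prompt for the file, as the question asks:
# extract(input('Path to the .mov file: '), 'output.txt')
```

Unlike the dictionary version, this keeps the titles in file order and does not drop a title that appears twice. If a future file adds or removes a column, the two field positions would need adjusting.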

I'll give you some pseudocode for parsing this file:

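#include <fstream>
#include <sstream>
#include <string>

// Read the whole file into a single string.
std::string Read(const std::string& fileName)
{
    std::ifstream file(fileName);
    std::stringstream buffer;
    buffer << file.rdbuf();
    return buffer.str();
}

// stringToFind       - the data you are looking for in the text
// stringToSearchWith - the text to search (assuming you stored the file in a string)
// charDelimiter      - the character that follows stringToFind; it tells you where the
//                      corresponding data starts, e.g. in "myFavNum 10000000e-210"
//                      the delimiter is a space
// Returns the index of the delimiter, or -1 if nothing was found.
int FindStr(const std::string& stringToFind, const std::string& stringToSearchWith, char charDelimiter)
{
    std::size_t start = stringToSearchWith.find(stringToFind);
    if (start == std::string::npos) return -1;
    std::size_t end = stringToSearchWith.find(charDelimiter, start + stringToFind.size());
    return end == std::string::npos ? -1 : static_cast<int>(end);
}

// stringToSearchWith - assumed to be a single line of your file
// index              - where the number is in that line (must be a valid index)
// Converts the text at index into an int; std::stoi skips leading whitespace
// and throws if no number is found there.
int ConvertToInt(const std::string& stringToSearchWith, int index)
{
    return std::stoi(stringToSearchWith.substr(index));
}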

Summary:

  1. Read

  2. Find

  3. Convert