idea for practice with shell scripting [closed]
I'm looking for a shell script idea to work on for practice with shell scripting. Can you please suggest intermediate ideas to work on? I'm a developer and I prefer working on an idea that deals with files.
For shell scripting, think of a task that you do frequently - and think how you would automate that task.
You can start off with a basic script that just about does what you need. Then you realize that there are small variations on the task, and you start to allow the script to handle those. And it gently becomes more complex.
Almost all of the scripts I have (some hundreds of them) started off as "I've done that before; how can I avoid having to do it again?".
Can you give an example?
No - because I don't know what tasks you do sufficiently often to be (minor) irritants that could be salved by writing a script.
Yes - because I've got scripts that I wrote incrementally, in an attempt to work around some issue or other in my environment.
One task that I'm working on - still a work in progress - is:
Identify duplicate files
Starting at some nominated directory (default, $HOME), find all the files, and for each file, establish a checksum (MD5, SHA1, SHA256 - it is not critical which) for the file; record the file name and checksum (and maybe device number and inode number).
Establish which checksums are repeated - hence identifying identical files.
Eliminate the unique checksums.
Group the duplicate files together with appropriate identifying information.
This much is fairly easy - it requires some medium-grade shell scripting and you might have to find a command to generate the checksum (but you might be OK with sum or cksum, though neither of those reaches even the level of MD5). I've done this in both shell and Perl.
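A minimal sketch of that first part, assuming GNU coreutils are available (the -w and --all-repeated flags to uniq are GNU extensions), could be:

#!/bin/bash
#
# Sketch: group duplicate files under a directory by MD5 checksum.
# Assumes GNU md5sum/sort/uniq; file names containing newlines will confuse it.

dir=${1:-$HOME}

# Checksum every regular file, sort so identical checksums are adjacent,
# then keep only lines whose first 32 characters (the MD5) repeat.
find "$dir" -type f -exec md5sum {} + |
sort |
uniq -w32 --all-repeated=separate

Each group of duplicates comes out separated by a blank line, which is a convenient starting point for deciding which copy to keep.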
The hard part - where I've not yet gotten a good solution - is then dealing with the duplicates. I have some 8,500 duplicated hashes, with about 27,000 file names in total. Some of the duplicates are images like smileys used in chat transcripts - there are a lot of copies of that particular image. Others are duplicate PDF files collected from various machines at various times; I need to organize them so I have one copy of the file on disk, with perhaps links in the other locations. But some of the other locations should go - they were convenient ways to get the material from retired machines onto my current machine.
I have not yet got a good solution to the second part.
Here are two scripts from my personal library. They are simple enough not to require a full-blown programming language, but aren't trivial, particularly if you aim to get all the details right (support all flags, return the same exit code, etc.).
cvsadd
Write a script to perform a recursive cvs add so you don't have to manually add each sub-directory and its files. Make it detect the file types and add the -kb flag for binary files as needed.
For bonus points: Allow the user to optionally specify a list of directories or files to restrict the search to. Handle file names with spaces correctly. If you can't figure out if a file is text or binary, ask the user.
#!/bin/bash
#
# Usage: cvsadd [FILE]...
#
# Mass `cvs add' script. Adds files and directories recursively, automatically
# figuring out if they are text or binary. If no file names are specified, looks
# for unversioned files and directories in the current directory.
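The script itself is left as the exercise, but a rough sketch of the core recursion (not the full thing - it relies on file(1) as the text/binary heuristic and only handles explicitly named arguments) might look like this:

#!/bin/bash
#
# Sketch only: recursively cvs add the named files and directories,
# passing -kb for anything file(1) does not report as text.
# Does not yet look for unversioned files when run with no arguments,
# or ask the user about ambiguous types.

add_entry() {
  local entry=$1 child
  if [ -d "$entry" ]; then
    cvs add "$entry"
    for child in "$entry"/*; do
      case $(basename "$child") in CVS) continue ;; esac
      [ -e "$child" ] && add_entry "$child"
    done
  elif file "$entry" | grep -q 'text'; then
    cvs add "$entry"       # text: default keyword expansion
  else
    cvs add -kb "$entry"   # binary: suppress keyword expansion
  fi
}

for arg in "$@"; do
  add_entry "$arg"
done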
svnfind
Write a wrapper around find which performs the same job, recursively finding files matching arbitrary criteria, but ignores .svn directories.
For bonus points: Allow other actions besides the default -print. Support the -H, -L, and -P options. Don't erroneously filter out files which simply happen to contain the substring .svn. Make usage identical to the regular find command.
#!/bin/bash
#
# Usage: svnfind [-H] [-L] [-P] [path...] [expression]
#
# Attempts to behave identically to a plain `find' command while ignoring .svn
# directories. Usage is identical to `find'.
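The essential trick is to prune .svn directories before evaluating the rest of the expression. A stripped-down sketch - paths only, no -H/-L/-P handling, default -print action only - could be:

#!/bin/bash
#
# Sketch: behave like find, but prune .svn directories.
# Handles only the default -print action; -H/-L/-P are not supported yet.

# Treat leading arguments as paths until the expression starts.
paths=()
while [ $# -gt 0 ]; do
  case $1 in
    "("|"!"|-*) break ;;
    *) paths+=("$1"); shift ;;
  esac
done
[ ${#paths[@]} -eq 0 ] && paths=(.)

if [ $# -eq 0 ]; then
  exec find "${paths[@]}" -type d -name .svn -prune -o -print
else
  exec find "${paths[@]}" -type d -name .svn -prune -o \( "$@" \) -print
fi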
You could try some simple CGI scripting. It can be done in shell and involves a lot of here documents, parsing and extracting of form values, a bit of escaping and whatever you want to do as payload. (I do not recommend exposing such a script to the hostile internet, though.)
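As a flavour of what that looks like, here is a toy (and deliberately unhardened) CGI script that reads one query-string value and builds the reply page from a here document; real use would need proper URL decoding and HTML escaping:

#!/bin/bash
#
# Toy CGI example: echo back the "name" field from the query string.
# No URL decoding or HTML escaping - do not expose this to the internet.

name=$(printf '%s' "$QUERY_STRING" | sed -n 's/.*name=\([^&]*\).*/\1/p')
name=${name:-world}

cat <<EOF
Content-Type: text/html

<html>
  <body>
    <h1>Hello, ${name}!</h1>
    <form method="get">
      <input name="name"> <input type="submit" value="Send">
    </form>
  </body>
</html>
EOF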