How to detect duplicate artifacts in Artifactory
I know that Artifactory uses checksum-based storage and will only store one copy of an artifact even if I upload multiple identical ones under different names.
As I have many projects with version-anonymous but probably identical jars, I would like to know whether there is any way of getting Artifactory to tell me which artifacts are referenced under multiple IDs.
While Artifactory has no existing feature that provides this info, it is quite easy to achieve with a small script that uses Artifactory's REST API.
You can, for example, write a tree walker (using the Folder Info resource) that maps checksums to files (a file's checksum can be obtained using the File Info resource).
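To make the tree-walker idea concrete, here is a minimal sketch in bash, not a finished tool. It assumes curl and jq are installed, that Folder Info returns a children array whose entries carry a uri, and that File Info returns checksums.sha1. The server URL, the credentials and the script name find-dupes.sh are placeholders, and paths that need URL-encoding (spaces, for instance) are not handled.

```shell
#!/bin/bash
# Sketch: walk one repository through the storage API and report every SHA-1
# that is stored under more than one path. Requires curl and jq.

# Assumed placeholders; point these at your own server and account.
ARTIFACTORY=${ARTIFACTORY:-https://artifactory.somesite.com/artifactory}
CREDS=${CREDS:-user:somepass}

# walk REPO PATH: print "<sha1><TAB><repo/path>" for every file below PATH.
walk() {
    local repo=$1 path=$2 info child
    info=$(curl -s -u "$CREDS" "$ARTIFACTORY/api/storage/$repo$path") || return
    if printf '%s' "$info" | jq -e '.children' >/dev/null 2>&1; then
        # Folder Info: recurse into each child ("/name", relative to this folder).
        printf '%s' "$info" | jq -r '.children[].uri' |
        while IFS= read -r child; do
            walk "$repo" "$path$child"
        done
    else
        # File Info: emit the checksum together with the path.
        printf '%s' "$info" | jq -r --arg p "$repo$path" \
            'select(.checksums.sha1) | "\(.checksums.sha1)\t\($p)"'
    fi
}

# report_duplicates: read "<sha1><TAB><path>" lines on stdin and print only
# the checksums seen more than once, each followed by its paths.
report_duplicates() {
    awk -F'\t' '
        { count[$1]++; paths[$1] = paths[$1] "  " $2 "\n" }
        END {
            for (sum in count)
                if (count[sum] > 1)
                    printf "%s\n%s", sum, paths[sum]
        }'
}

# Run only when a repository key is given on the command line.
if [ -n "${1:-}" ]; then
    walk "$1" "" | report_duplicates
fi
```

Saved as find-dupes.sh, it would be called with a repository key, for example ./find-dupes.sh libs-release-local (the key is only an example). It makes one request per folder and per file, so expect it to be slow on large repositories.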
Or, if you use the Pro version of Artifactory, you can retrieve a list of all artifacts within a repository using the File List resource.
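With File List the whole repository comes back in a single request, so the script gets shorter. The following is a rough sketch under a few assumptions: curl, jq and GNU uniq are available, the reply contains a files array whose entries have uri, folder and sha1 fields, and the server URL and credentials are placeholders.

```shell
#!/bin/bash
# Sketch: list every file in a repository with one File List request
# (Artifactory Pro) and print the entries whose SHA-1 occurs more than once.
# Requires curl, jq and GNU uniq.

# Assumed placeholders; point these at your own server and account.
ARTIFACTORY=${ARTIFACTORY:-https://artifactory.somesite.com/artifactory}
CREDS=${CREDS:-user:somepass}

# list_checksums REPO: print "<sha1> <path>" for each file in REPO.
list_checksums() {
    curl -s -u "$CREDS" "$ARTIFACTORY/api/storage/$1?list&deep=1&listFolders=0" |
        jq -r '.files[]? | select(.folder | not) | "\(.sha1) \(.uri)"'
}

# only_duplicates: keep lines whose first 40 characters (the SHA-1 in hex)
# are shared with another line; groups are separated by a blank line.
only_duplicates() {
    sort | uniq -w 40 --all-repeated=separate
}

# Run only when a repository key is given on the command line.
if [ -n "${1:-}" ]; then
    list_checksums "$1" | only_duplicates
fi
```

Each output group is one stored binary, followed by every path that references it.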
Here's SQL to run against the PostgreSQL database. I haven't tried it with any other database.
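-- List every file whose SHA-1 is shared with at least one other file.
select sha1_actual, node_name, node_path, repo, *
from nodes
where sha1_actual in
(
    -- node_type != 0 is meant to skip folders, which have no checksum
    select sha1_actual
    from nodes
    where node_type != 0
    group by sha1_actual
    having count(1) > 1
)
order by sha1_actual;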
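#!/bin/bash
#
# Search Artifactory by name and list how often each file name occurs.
# angelos@unix.gr
#
term=$1
if [ -z "$term" ]
then
    echo "Usage: $0 <search item>"
    exit 1
fi
# URL-encode spaces and wrap the term in wildcards for the search API.
search="*$(echo "$term" | sed -e 's/ /%20/g')*"
USER=$(whoami)
PASS=${PASS:-somepass}
CREDS=${USER}:${PASS}
ARTIFACTORY=https://artifactory.somesite.com/artifactory
# Quote the URL so the shell does not expand the wildcards itself.
curl -s -u "${CREDS}" -o search.txt "${ARTIFACTORY}/api/search/artifact?name=${search}"
echo "List of all uris is in search.txt"
echo "All instances of $search follow"
echo "---------------------------------"
# Take the URI from each "uri" line of the JSON reply, drop POMs, reduce each
# URI to its file name, then count identical names (most frequent first).
# This relies on the reply being pretty-printed with one "uri" entry per line.
grep '"uri"' search.txt | grep -v pom | awk -F'"' '{ print $4 }' | sed -e 's|.*/||' | sort | uniq -c | sort -rn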