开发者

Compare two tables in Java

I have to compare two tables and tell the user what's the difference between two tables.

Table1

------+---------
|Code | Label   |  
------+---------  
|a1   | a1text  |  
------+---------  
|b1   | b1text  |  
------+---------  
|c1   | bartext1|  
------+---------  
|e1   | foo     |  
-----+--------- 

Table2

------+---------  
|Code | Label   |  
------+---------  
|a1   | a1text  |  
------+---------  
|b1   | b2text  |  
------+---------  
|d1   | bartext2|  
------+---------  
|f1   | bar     |  
------+--------- 

Comparison Info

As you see in table1 for Code c1 label is bartext1 and for Code d1 label is bartext2. They are same expect last character. I have to put i开发者_StackOverflow社区n my report they are same except last character. There are few rows in tables where there may be extra word or special character and its position is anywhere. Somehow I have to tell in the report both labels are same expect word is missing or there is a special character in one of them.Code is not important in report.

More Info

This data is coming from Third party.Code is always unique, they are no duplicate Codes. It may be possible that two codes have similar value like

Code | Label

ER4 | i have a brother

WE3 | i have a brother

Expected Output should be

  1. Labels are different in two tables. Table1 label is : b1text and Tabl2 label is: b2text.
  2. Labels are different in two tables. Table1 label is : bartext1 and Tabl2 label is: bartext2.
  3. label foo is missing in Table2.
  4. label bar is missing in Table1.


I'd create an abstraction for this Table object with an equals implementation that would hide all the details from clients. Java's an object-oriented language, so best to use objects for their raison d'etre.


This worked for me, please feel free to add salt to taste:

public final class ComparisonTest {

@Test
public void compare() throws Exception {
    String url = "your.url";
    String user = "your.user";
    String password = "your.password";
    // I am using Oracle here, but you can use any database
    Connection connection = getConnection(url, user, password, OracleDriver.class);

    ResultSet sourceResultSet = getResultSet(connection, "first_table");
    ResultSet targetResultSet = getResultSet(connection, "second_table");
    Map<Long, String> sourceIdHash = new HashMap<Long, String>();
    Map<Long, String> targetIdHash = new HashMap<Long, String>();

    try {
        long rows = 0;
        do {
            if (sourceResultSet.next()) {
                if (targetResultSet.next()) {
                    // Compare the lines
                    long sourceHash = hash(getRowValues(sourceResultSet, sourceResultSet.getMetaData()));
                    long targetHash = hash(getRowValues(targetResultSet, targetResultSet.getMetaData()));

                    sourceIdHash.put(sourceHash, sourceResultSet.getString(1));
                    targetIdHash.put(targetHash, targetResultSet.getString(1));

                    if (targetIdHash.containsKey(sourceHash)) {
                        targetIdHash.remove(sourceHash);
                        sourceIdHash.remove(sourceHash);
                    }
                    if (sourceIdHash.containsKey(targetHash)) {
                        sourceIdHash.remove(targetHash);
                        targetIdHash.remove(targetHash);
                    }
                } else {
                    // Add the source row
                    long sourceHash = hash(getRowValues(sourceResultSet, sourceResultSet.getMetaData()));
                    sourceIdHash.put(sourceHash, sourceResultSet.getString(1));
                }
            } else {
                if (targetResultSet.next()) {
                    // Add the target row
                    long targetHash = hash(getRowValues(targetResultSet, targetResultSet.getMetaData()));
                    targetIdHash.put(targetHash, targetResultSet.getString(1));
                } else {
                    break;
                }
            }
            if (rows++ % 10000 == 0) {
                System.out.println("Rows : " + rows);
            }
        } while (true);
    } finally {
        closeAll(sourceResultSet);
        closeAll(targetResultSet);
    }

    for (final Map.Entry<Long, String> mapEntry : sourceIdHash.entrySet()) {
        if (targetIdHash.containsKey(mapEntry.getKey())) {
            targetIdHash.remove(mapEntry.getKey());
            continue;
        }
        System.out.println("Not in target : " + mapEntry.getValue());
    }
    for (final Map.Entry<Long, String> mapEntry : targetIdHash.entrySet()) {
        if (sourceIdHash.containsKey(mapEntry.getKey())) {
            sourceIdHash.remove(mapEntry.getKey());
            continue;
        }
        System.out.println("Not in source : " + mapEntry.getValue());
    }

    System.out.println("In source and not target : " + sourceIdHash.size());
    System.out.println("In target and not source : " + targetIdHash.size());
}

private ResultSet getResultSet(final Connection connection, final String tableName) {
    String query = "select * from " + tableName + " order by pdb_key, organization_code, service_littera, day, resource_category";
    return executeQuery(connection, query);
}

private Object[] getRowValues(final ResultSet resultSet, final ResultSetMetaData resultSetMetaData) throws SQLException {
    List<Object> rowValues = new ArrayList<Object>();
    for (int i = 2; i < resultSetMetaData.getColumnCount(); i++) {
        rowValues.add(resultSet.getObject(i));
    }
    return rowValues.toArray(new Object[rowValues.size()]);
}

private final Connection getConnection(final String url, final String user, final String password, final Class<? extends Driver> driverClass) {
    try {
        DriverManager.registerDriver(driverClass.newInstance());
        return DriverManager.getConnection(url, user, password);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

private final ResultSet executeQuery(final Connection connection, final String query) {
    try {
        return connection.createStatement().executeQuery(query);
    } catch (SQLException e) {
        throw new RuntimeException(e);
    }
}

private final Long hash(final Object... objects) {
    StringBuilder builder = new StringBuilder();
    for (Object object : objects) {
        builder.append(object);
    }
    return hash(builder.toString());
}

public Long hash(final String string) {
    // Must be prime of course
    long seed = 131; // 31 131 1313 13131 131313 etc..
    long hash = 0;
    char[] chars = string.toCharArray();
    for (int i = 0; i < chars.length; i++) {
        hash = (hash * seed) + chars[i];
    }
    return Long.valueOf(Math.abs(hash));
}

private void closeAll(final ResultSet resultSet) {
    Statement statement = null;
    Connection connection = null;
    try {
        if (resultSet != null) {
            statement = resultSet.getStatement();
        }
        if (statement != null) {
            connection = statement.getConnection();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    close(resultSet);
    close(statement);
    close(connection);
}

private void close(final Statement statement) {
    if (statement == null) {
        return;
    }
    try {
        statement.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

private void close(final Connection connection) {
    if (connection == null) {
        return;
    }
    try {
        connection.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

private void close(final ResultSet resultSet) {
    if (resultSet == null) {
        return;
    }
    try {
        resultSet.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

}


There's an Open Source Java framework that does that:

www.diffkit.org

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜