Query with JOIN FETCH performance problem

2023-02-09 17:41 问答作者：

I have problem with hibernate query performance which I can't figure out. In code snippet below I need select entities with at least one mapping and filtered mapping. I'm using FETCH JOIN for this to load only filtered ma开发者_StackOverflow社区ppings. But in that case I have performance problems with query. Hibernate says warning log :

org.hibernate.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!

When I omit FETCH JOIN and left only JOIN query is nice fast. But in result I have all mappings loaded to entity which is not acceptable state for me. Is there a way to boost query performance? There are a lot rows in mapping table.

HQL query :

select distinct e from Entity 
   join fetch e.mappings as mapping 
where e.deleted = 0 and e.mappings is not empty 
   and e = mapping.e and mapping.approval in (:approvals)

Entities :

@Entity
@Table(name="entity")
class Entity {

   ...

   @OneToMany(mappedBy="entity", cascade=CascadeType.REMOVE, fetch=FetchType.LAZY)
   @OrderBy("created")
   private List<Mapping> mappings = new ArrayList<Mapping>();

   ...
}

@Entity
@Table(name="mapping")
class Mapping {

public static enum MappingApproval {
    WAITING, // mapping is waiting for approval
    APPROVED, // mapping was approved
    DECLINED; // mapping was declined
}

...

    @ManyToOne(fetch=FetchType.EAGER)
    @JoinColumn(name="entity_id", nullable=false)
    private Entity entity;

    @Enumerated(EnumType.STRING)
    @Column(name="approval", length=20)
    private MappingApproval approval;

...

}

Thanks

From the JPA-Specifications

The effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined. (JPA "Enterprise JavaBeans 3.0, Final Release", Kapitel 3.6.1 Query Interface)

Hibernate does the right thing, but executes a part of the query in memory, which is tremendously slower. In my case the difference is between 3-5 ms to 400-500 ms.

My solution was to implement the paging within the query itself. Works fast with the JOIN FETCH.

If you need a firstResult/maxResults with "fetch" you can split your query in 2 queries:

Query your entity ids with firstResult/maxResults but without the "fetch" on sub-tables:
```
select entity.id from entity (without fetch) where .... (with firstResult/maxResults)
```
Query your entities with the "fetch" on the ids returned by your first query:
```
select entity from entity fetch ... where id in <previous ids>
```

The reason is slow is because Hibernate executes the SQL query with no pagination at all and the restriction is done in memory.

However, if the join has to scan and fetch 100k records, while you are interested in just 100 results, then 99.9% of the work being done by the Extractor and all the I/O done over networking is just waste.

You can easily turn a JPQL query that uses both JOIN FETCH and pagination:

List<Post> posts = entityManager.createQuery("""
    select p
    from Post p
    left join fetch p.comments
    where p.title like :title
    order by p.id
    """, Post.class)
.setParameter("title", titlePattern)
.setMaxResults(maxResults)
.getResultList();

into an SQL query that limits the result using DENSE_RANK by the parent identifier:

@NamedNativeQuery(
    name = "PostWithCommentByRank",
    query =
        "SELECT * " +
        "FROM (   " +
        "    SELECT *, dense_rank() OVER (ORDER BY \"p.created_on\", \"p.id\") rank " +
        "    FROM (   " +
        "        SELECT p.id AS \"p.id\", " +
        "               p.created_on AS \"p.created_on\", " +
        "               p.title AS \"p.title\", " +
        "               pc.id as \"pc.id\", " +
        "               pc.created_on AS \"pc.created_on\", " +
        "               pc.review AS \"pc.review\", " +
        "               pc.post_id AS \"pc.post_id\" " +
        "        FROM post p  " +
        "        LEFT JOIN post_comment pc ON p.id = pc.post_id " +
        "        WHERE p.title LIKE :titlePattern " +
        "        ORDER BY p.created_on " +
        "    ) p_pc " +
        ") p_pc_r " +
        "WHERE p_pc_r.rank <= :rank ",
    resultSetMapping = "PostWithCommentByRankMapping"
)
@SqlResultSetMapping(
    name = "PostWithCommentByRankMapping",
    entities = {
        @EntityResult(
            entityClass = Post.class,
            fields = {
                @FieldResult(name = "id", column = "p.id"),
                @FieldResult(name = "createdOn", column = "p.created_on"),
                @FieldResult(name = "title", column = "p.title"),
            }
        ),
        @EntityResult(
            entityClass = PostComment.class,
            fields = {
                @FieldResult(name = "id", column = "pc.id"),
                @FieldResult(name = "createdOn", column = "pc.created_on"),
                @FieldResult(name = "review", column = "pc.review"),
                @FieldResult(name = "post", column = "pc.post_id"),
            }
        )
    }
)

The query can be executed like this:

List<Post> posts = entityManager
.createNamedQuery("PostWithCommentByRank")
.setParameter(
    "titlePattern",
    "High-Performance Java Persistence %"
)
.setParameter(
    "rank",
    5
)
.unwrap(NativeQuery.class)
.setResultTransformer(
    new DistinctPostResultTransformer(entityManager)
)
.getResultList();

To transform the tabular result set back into an entity graph, you need a ResultTransformer which looks as follows:

public class DistinctPostResultTransformer
        extends BasicTransformerAdapter {
 
    private final EntityManager entityManager;
 
    public DistinctPostResultTransformer(
            EntityManager entityManager) {
        this.entityManager = entityManager;
    }
 
    @Override
    public List transformList(
            List list) {
             
        Map<Serializable, Identifiable> identifiableMap =
            new LinkedHashMap<>(list.size());
             
        for (Object entityArray : list) {
            if (Object[].class.isAssignableFrom(entityArray.getClass())) {
                Post post = null;
                PostComment comment = null;
 
                Object[] tuples = (Object[]) entityArray;
 
                for (Object tuple : tuples) {
                    if(tuple instanceof Identifiable) {
                        entityManager.detach(tuple);
 
                        if (tuple instanceof Post) {
                            post = (Post) tuple;
                        }
                        else if (tuple instanceof PostComment) {
                            comment = (PostComment) tuple;
                        }
                        else {
                            throw new UnsupportedOperationException(
                                "Tuple " + tuple.getClass() + " is not supported!"
                            );
                        }
                    }
                }
 
                if (post != null) {
                    if (!identifiableMap.containsKey(post.getId())) {
                        identifiableMap.put(post.getId(), post);
                        post.setComments(new ArrayList<>());
                    }
                    if (comment != null) {
                        post.addComment(comment);
                    }
                }
            }
        }
        return new ArrayList<>(identifiableMap.values());
    }
}

That's it!

after increasing memory for JVM things goes much better. After all I end with not using FETCH in queries.

Query with JOIN FETCH performance problem

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？