This post covers the N+1 topics that the previous post left out. The topics are as follows.

Differences between join and fetch join
Paging
MultipleBagFetchException
DTO

Entities

The entities reuse Post and Comment from the previous post, with an Attachment entity added to the associations of Post.

@Entity
@Table(name = "post")
@Getter
@NoArgsConstructor
public class Post {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @Column(length = 50, nullable = false)
  private String title;

  @Column(length = 200, nullable = false)
  private String content;

  @OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
  private List<Comment> comments = new ArrayList<Comment>();

  @OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
  private List<Attachment> attachments = new ArrayList<Attachment>();
}

@Entity
@Table(name = "comment")
@Getter
@NoArgsConstructor
public class Comment {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @ManyToOne(fetch = FetchType.LAZY)
  @JoinColumn(name = "post_id", nullable = false)
  private Post post;

  @Column(length = 200, nullable = false)
  private String content;
}

The Attachment entity is new in this post. It has a bidirectional association with the Post entity.

@Entity
@Getter
@NoArgsConstructor
public class Attachment {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String filename;

    @ManyToOne
    @JoinColumn(name = "post_id", nullable = false)
    private Post post;
}

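Differences between join and fetch join

A plain join corresponds to the join that SQL provides, while fetch join is provided by JPQL, not by SQL.

Fetch join is one of the standard ways to avoid the N+1 problem. How does it differ from a plain join?

A plain join loads only the target entity into the persistence context. The association on the target entity (here Post.comments) is left uninitialized.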

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p, c from Post p join p.comments c")
  List<Post> findAllPostWithCommentJoin();
}

-- Query log
-- The select clause also includes the associated Comment entity
select
    p1_0.id,
    p1_0.content,
    p1_0.title,
    c1_0.id,
    c1_0.content,
    c1_0.post_id 
from
    post p1_0 
join
    comment c1_0 
        on p1_0.id=c1_0.post_id

-- Even so, one Comment query per Post follows (N+1)
select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id=?

select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id=?

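Would it help to include the associated entity in the select clause? It does not; the N+1 problem still occurs. A plain join does not initialize the association on the target entity, so accessing the associated entities through the target entity still queries the DB.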

Paging

One-to-many

fetch join

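Paging a query that fetch joins a collection has a problem.

The paging is not applied to the SQL. Every row is read without a limit, and the paging is then performed in memory.

Even when paging is configured through the setFirstResult and setMaxResults methods of the query, the generated SQL contains no paging clause.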

firstResult/maxResults specified with collection fetch; applying in memory

The log states that the paging is applied in memory.
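The reason a row limit does not fit a collection fetch can be sketched in plain Java. This is a simplified model, not Hibernate code: Row, joinedRows, and the two paging methods are names made up for this illustration. A join returns one row per (post, comment) pair, so cutting rows is not the same as cutting posts.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of "post join comment" result rows.
class CollectionFetchPaging {

    record Row(long postId, long commentId) {}

    // Builds the rows a join would return: one row per (post, comment) pair.
    static List<Row> joinedRows(int postCount, int commentsPerPost) {
        List<Row> rows = new ArrayList<>();
        long commentId = 1;
        for (long postId = 1; postId <= postCount; postId++) {
            for (int i = 0; i < commentsPerPost; i++) {
                rows.add(new Row(postId, commentId++));
            }
        }
        return rows;
    }

    // A limit on joined rows: only the posts that appear in the first `limit` rows.
    static Set<Long> postsWithRowLimit(List<Row> rows, int limit) {
        Set<Long> postIds = new LinkedHashSet<>();
        rows.stream().limit(limit).forEach(row -> postIds.add(row.postId()));
        return postIds;
    }

    // In-memory paging: read every row, collapse to distinct posts, then cut the page.
    static List<Long> postsWithInMemoryLimit(List<Row> rows, int limit) {
        Set<Long> postIds = new LinkedHashSet<>();
        rows.forEach(row -> postIds.add(row.postId()));
        return postIds.stream().limit(limit).toList();
    }

    public static void main(String[] args) {
        List<Row> rows = joinedRows(3, 4); // 3 posts x 4 comments = 12 rows
        System.out.println(postsWithRowLimit(rows, 2));      // [1]
        System.out.println(postsWithInMemoryLimit(rows, 2)); // [1, 2]
    }
}
```

A page of 2 posts cannot be expressed as "the first 2 rows", which is why every row has to be read before the page can be cut.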

join

-- Paging query is issued
select
    p1_0.id,
    p1_0.content,
    p1_0.title 
from
    post p1_0 
join
    comment c1_0 
        on p1_0.id=c1_0.post_id 
limit
    ?, ?

-- N + 1: one Comment query per Post
select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id=?

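With a plain join the paging is applied, but N + 1 queries occur.

As noted in the comparison of fetch join and join, a plain join does not load the associated entities, so accessing their fields triggers additional queries at that point.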

batch size

Batch size can be used to address the N + 1 problem that a plain join causes when paging over a one-to-many relationship.

With fetch join, setting a batch size changes nothing: the in-memory paging log still appears. To use batch size, the fetch join has to be dropped.

#application.properties
spring.jpa.properties.hibernate.default_batch_fetch_size=10

@Entity
@Table(name = "post")
@Getter
public class Post {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(length = 50, nullable = false)
    private String title;

    @Column(length = 200, nullable = false)
    private String content;

    @OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
    @BatchSize(size = 15)
    private List<Comment> comments = new ArrayList<Comment>();
}

The batch size can be set in application.properties or on the entity class. If both are set, the value on the entity class applies.

-- Paging query is issued
select
	p1_0.id,
	p1_0.content,
	p1_0.title 
from
	post p1_0 
join
	comment c1_0 
		on p1_0.id=c1_0.post_id 
limit
	?, ?

-- IN query sized by the configured batch size
select
	c1_0.post_id,
	c1_0.id,
	c1_0.content 
from
	comment c1_0 
where
	c1_0.post_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

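With batch size, the associated entities are loaded through an IN query.

The IN query is still an extra query, but without batch size one additional query runs for every primary entity, which can add up to far more queries. One IN query covers a whole batch of Posts, so the total is lower than with N + 1.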
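The difference in query count can be written down as simple arithmetic. The sketch below is an idealized count under the assumption that every Post's comments are accessed and that each IN query carries a full batch of ids; the class and method names are made up for this example, and the exact batching Hibernate performs may differ by version and settings.

```java
// Idealized query counts for loading Post.comments lazily.
class BatchFetchQueryCount {

    // 1 query for the posts + 1 lazy-load query per post.
    static int withoutBatchSize(int postCount) {
        return 1 + postCount;
    }

    // 1 query for the posts + 1 IN query per batch of post ids.
    static int withBatchSize(int postCount, int batchSize) {
        int inQueries = (postCount + batchSize - 1) / batchSize; // ceiling division
        return 1 + inQueries;
    }

    public static void main(String[] args) {
        System.out.println(withoutBatchSize(15));  // 16
        System.out.println(withBatchSize(15, 15)); // 2
        System.out.println(withBatchSize(40, 15)); // 4
    }
}
```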

Many-to-one

fetch join

select
    c1_0.id,
    c1_0.content,
    c1_0.post_id,
    p1_0.id,
    p1_0.content,
    p1_0.title 
from
    comment c1_0 
join
    post p1_0 
        on p1_0.id=c1_0.post_id 
where
    p1_0.id=? 
limit
    ?, ?

In a many-to-one (N:1) relationship, paging works with fetch join. Unlike the one-to-many case, no in-memory paging is attempted.

join

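N + 1 queries may occur depending on the case. Here, however, the where clause restricts the result to a single Post, so even when it happens only one additional query is issued.

As mentioned earlier, querying with a plain join instead of a fetch join does not load the associated entity into the persistence context. Accessing a field of that unloaded entity therefore triggers an additional query.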

@Service
public class CommentService {
  public List<CommentResponse> findAllCommentPaging(Long postId, Pageable pageable) {
    List<Comment> comments = commentEmRepository.findAllJoinLimitByEm(postId, pageable);
    return comments
      .stream()
      .map(CommentResponse::toDTO)
      .collect(Collectors.toList());
  }
}

@Repository
public class CommentEmRepository {
  public List<Comment> findAllJoinLimitByEm(Long postId, Pageable pageable) {
    TypedQuery<Comment> query = em.createQuery("select c from Comment c join c.post where c.post.id = :postId", Comment.class);
    query.setParameter("postId", postId);
    query.setFirstResult((int) pageable.getOffset()); // row offset, not the page index
    query.setMaxResults(pageable.getPageSize());
    return query.getResultList();
  }
}
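
Note that setFirstResult takes the position of the first row rather than a page index. Spring Data's Pageable exposes that position through getOffset(), which for a zero-based page is the page number multiplied by the page size. The sketch below only shows that arithmetic; the class and method names are made up for illustration.

```java
// Relationship between a zero-based page index and the row offset.
class PageOffset {

    // Row position of the first element on the given page.
    static long offsetOf(int pageNumber, int pageSize) {
        return (long) pageNumber * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(0, 10)); // 0
        System.out.println(offsetOf(2, 10)); // 20
    }
}
```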

public class CommentResponse {

    private final Long id;
    private final Long postId;
    private final String content;

    public static CommentResponse toDTO(Comment comment) {
        return CommentResponse
                .builder()
                .comment(comment)
                .build();
    }

    @Builder
    public CommentResponse(Comment comment) {
        this.id = comment.getId();
        this.postId = comment.getPost().getId();
        this.content = comment.getContent();
    }
}

"May occur depending on the case" also means that N + 1 queries may not occur. The Service asks the Repository for the data, and the Repository handles the paging with a plain join. The result is then converted to CommentResponse, whose postId is the id of the Post entity associated with the Comment.

With a plain join, the persistence context holds a Post proxy rather than the actual Post entity. But postId is a column of the comment table: the post field of the Comment entity is stored in the table as the post_id column. The id is a field of the Post entity and at the same time a column of the comment table, so it is known from the Comment row alone. Therefore no N + 1 query occurs.

public class CommentResponse {

    private final Long id;
    private final String postTitle;
    private final String content;

    public static CommentResponse toDTO(Comment comment) {
        return CommentResponse
                .builder()
                .comment(comment)
                .build();
    }

    @Builder
    public CommentResponse(Comment comment) {
        this.id = comment.getId();
        // Access the title field of the Post entity
        this.postTitle = comment.getPost().getTitle();
        this.content = comment.getContent();
    }
}

If the code accesses the title instead of the Post id (the post_id column of the comment table), the N + 1 query does occur.
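
The difference between reading the id and reading the title can be mimicked with a small stand-in class. This is not how Hibernate implements proxies; LazyPostProxy, PostData, and the loader are hypothetical names, and the Supplier merely plays the role of the additional SELECT.

```java
import java.util.function.Supplier;

// Plain Java stand-in for a lazily loaded Post reference.
class LazyPostProxy {

    record PostData(long id, String title) {}

    private final long id;                   // already known from comment.post_id
    private final Supplier<PostData> loader; // stands in for the extra SELECT
    private PostData loaded;
    private int loadCount;

    LazyPostProxy(long id, Supplier<PostData> loader) {
        this.id = id;
        this.loader = loader;
    }

    // The id is available without touching the "database".
    long getId() {
        return id;
    }

    // Any other field forces the load on first access only.
    String getTitle() {
        if (loaded == null) {
            loaded = loader.get();
            loadCount++;
        }
        return loaded.title();
    }

    int getLoadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyPostProxy post = new LazyPostProxy(7L, () -> new PostData(7L, "hello"));
        post.getId();
        System.out.println(post.getLoadCount()); // 0
        post.getTitle();
        System.out.println(post.getLoadCount()); // 1
    }
}
```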

MultipleBagFetchException

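It is thrown when fetch join is applied to two or more one-to-many collections at once (collections mapped as List, which Hibernate calls bags).

The example loads Comment and Attachment together from the Post entity.

Fetching all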

join

-- A join query is issued, but Comment and Attachment are not loaded into the persistence context
select
    p1_0.id,
    p1_0.content,
    p1_0.title 
from
    post p1_0 
join
    comment c1_0 
        on p1_0.id=c1_0.post_id 
join
    attachment a1_0 
        on p1_0.id=a1_0.post_id

-- One Comment query per Post is issued
select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id=?

-- One Attachment query per Post is issued
select
    a1_0.post_id,
    a1_0.id,
    a1_0.filename 
from
    attachment a1_0 
where
    a1_0.post_id=?

@Service
public class PostService {
  public void findAllPostWithCommentAndAttachment() {
    List<Post> posts = postRepository.findAllPostWithCommentAndAttachmentNoJoinFetch();

    List<String> commentContents = posts
      .stream()
      .map(Post::getComments)
      .flatMap(Collection::stream)
      .map(Comment::getContent)
      .toList();

    List<String> attachmentFilenames = posts
      .stream()
      .map(Post::getAttachments)
      .flatMap(Collection::stream)
      .map(Attachment::getFilename)
      .toList();
  }
}

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p from Post p join p.comments c join p.attachments a")
  List<Post> findAllPostWithCommentAndAttachmentNoJoinFetch();
}

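Loading all Posts with a plain join and then reading the one-to-many Comment and Attachment associations causes N + 1 queries. A plain join does not load the associated entities into the persistence context, so the associations stay uninitialized until they are accessed. On access, a query goes to the DB to load the actual entity data into the persistence context.

The additional queries run once per Post for Comment and once per Post for Attachment, so this is effectively N + N + 1 queries. For example, if the Post query returns 2 Posts, the Comment query runs 2 times and the Attachment query runs 2 times.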

fetch join

org.hibernate.loader.MultipleBagFetchException: cannot simultaneously fetch multiple bags:

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p from Post p join fetch p.comments join fetch p.attachments")
  List<Post> findAllPostWithCommentAndAttachmentJoinFetch();
}

Using fetch join instead of a plain join throws MultipleBagFetchException.
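
A commonly given reason for this restriction is that joining two collections in one query yields a Cartesian product per Post, and a bag has no way to tell those repeated rows apart from real duplicates. The sketch below only illustrates the row multiplication in plain Java; Row, joinBoth, and occurrencesOfComment are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of "post join comment join attachment" rows for one Post.
class MultipleBagRows {

    record Row(long postId, long commentId, long attachmentId) {}

    // Every comment is paired with every attachment of the same post.
    static List<Row> joinBoth(long postId, List<Long> commentIds, List<Long> attachmentIds) {
        List<Row> rows = new ArrayList<>();
        for (long commentId : commentIds) {
            for (long attachmentId : attachmentIds) {
                rows.add(new Row(postId, commentId, attachmentId));
            }
        }
        return rows;
    }

    // How often one comment shows up in the joined rows.
    static long occurrencesOfComment(List<Row> rows, long commentId) {
        return rows.stream().filter(row -> row.commentId() == commentId).count();
    }

    public static void main(String[] args) {
        List<Row> rows = joinBoth(1L, List.of(10L, 11L), List.of(20L, 21L, 22L));
        System.out.println(rows.size());                     // 6
        System.out.println(occurrencesOfComment(rows, 10L)); // 3
    }
}
```

With 2 comments and 3 attachments the query returns 6 rows, and each comment appears 3 times.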

batch size

Batch size can be used to reduce the number of queries as far as possible without triggering MultipleBagFetchException.

Setting a batch size while still using fetch join does not help; MultipleBagFetchException is still thrown. To apply batch size, the fetch join has to be dropped.

@Entity
@NoArgsConstructor
public class Post {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(length = 50, nullable = false)
    private String title;

    @Column(length = 200, nullable = false)
    private String content;

    @BatchSize(size = 10)
    @OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
    private List<Comment> comments = new ArrayList<>();

    @BatchSize(size = 10)
    @OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
    private List<Attachment> attachments = new ArrayList<>();
}

The @BatchSize annotation can be set on the Comment and Attachment collections individually.

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p from Post p join p.comments c join p.attachments a")
  List<Post> findAllPostWithCommentAndAttachmentNoJoinFetch();
}

@Service
public class PostService {
  public void findAllPostWithCommentAndAttachment() {
    List<Post> posts = postRepository.findAll();
  }
}

Either the join query or findAll, the built-in method of Spring Data JPA, can be used.

-- Post query (a join is emitted, but since it is a plain join rather than a fetch join, the associated entities are not loaded into the persistence context)
select
    p1_0.id,
    p1_0.content,
    p1_0.title 
from
    post p1_0 
join
    comment c1_0 
        on p1_0.id=c1_0.post_id 
join
    attachment a1_0 
        on p1_0.id=a1_0.post_id

-- IN query that loads Comment
select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

-- IN query that loads Attachment
select
    a1_0.post_id,
    a1_0.id,
    a1_0.filename 
from
    attachment a1_0 
where
    a1_0.post_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

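Each IN query carries as many ids as the configured batch size. If more Posts are returned than the batch size, further IN queries follow, but if the batch size is larger than the number of Posts, Comment and Attachment are each loaded with a single query.

Fetching a single entity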

join

-- A join query is issued, but Comment and Attachment are not loaded into the persistence context
select
    p1_0.id,
    p1_0.content,
    p1_0.title 
from
    post p1_0 
join
    comment c1_0 
        on p1_0.id=c1_0.post_id 
join
    attachment a1_0 
        on p1_0.id=a1_0.post_id 
where
    p1_0.id=?

-- Comment query is issued
select
    c1_0.post_id,
    c1_0.id,
    c1_0.content 
from
    comment c1_0 
where
    c1_0.post_id=?

-- Attachment query is issued
select
    a1_0.post_id,
    a1_0.id,
    a1_0.filename 
from
    attachment a1_0 
where
    a1_0.post_id=?

@Service
public class PostService {
  public void findPostWithCommentAndAttachment(Long id) {
    Post post = postEmRepository
      .findPostWithCommentAndAttachmentNoJoinFetch(id);

    List<String> commentContents = post
      .getComments()
      .stream()
      .map(Comment::getContent)
      .toList();

    List<String> attachmentFilenames = post
      .getAttachments()
      .stream()
      .map(Attachment::getFilename)
      .toList();
  }
}

One query each is issued for Post, Comment, and Attachment.

fetch join

org.hibernate.loader.MultipleBagFetchException: cannot simultaneously fetch multiple bags:

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p from Post p join fetch p.comments join fetch p.attachments where p.id = :id")
  Post findPostWithCommentAndAttachment(@Param("id") Long id);
}

Using fetch join throws MultipleBagFetchException.

batch size

Unlike fetching all, applying batch size to the join query produces no IN query here because only a single entity is loaded. The queries are the same as with the plain join alone.

@Repository
public class PostEmRepository {
  public Post findPostWithCommentAndAttachmentNoJoinFetch(Long postId) {
    return em.createQuery("select p from Post p join p.comments join p.attachments where p.id = :id", Post.class)
      .setParameter("id", postId)
      .getSingleResult();
  }
}

@Repository
public interface PostRepository extends JpaRepository<Post, Long> {
  @Query("select p from Post p join p.comments join p.attachments where p.id = :id")
  Post findPostWithCommentAndAttachmentNoJoinFetch(@Param("id") Long id);
}

The JPQL query through EntityManager and the Spring Data JPA query produce the same result.

DTO

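Converting to a DTO is usually preferred over returning the entity to the client as-is. The client may not need every field of the entity, and the entity may hold sensitive information that must not be exposed to the client.

For example, if the client only needs the ID from a member entity that holds an ID, password, address, and contact information, there is no need to send the rest of the entity.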

Could not write JSON: Infinite recursion (StackOverflowError)

@JsonIgnore

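Take care when the target entity is converted to a DTO but the associated entity is exposed as-is in a DTO field. If the association is bidirectional, @JsonIgnore must be applied to one side.

Otherwise serialization can fall into an infinite loop. Going from the target entity to the associated entity, the serializer finds the target entity declared as a field there and visits it again. The target entity in turn declares the associated entity as a field, so the serializer goes back to it, and the cycle never ends.

A RestController converts objects to JSON through ObjectMapper; applying @JsonIgnore keeps ObjectMapper from following the entity references endlessly.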
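
The DTO alternative can be sketched without any framework. In the following plain Java example the Post and Comment classes are simplified stand-ins for the entities (no JPA annotations), and CommentDto, PostDto, and toDto are names made up for illustration. The DTO copies only scalar values, so the resulting object graph has no back-reference for a serializer to follow.

```java
import java.util.ArrayList;
import java.util.List;

// Breaking a bidirectional reference cycle by copying into DTOs.
class CycleFreeDto {

    // Simplified stand-in for the Post entity.
    static class Post {
        final long id;
        final String title;
        final List<Comment> comments = new ArrayList<>();

        Post(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Simplified stand-in for the Comment entity; it points back to its Post.
    static class Comment {
        final long id;
        final String content;
        final Post post;

        Comment(long id, String content, Post post) {
            this.id = id;
            this.content = content;
            this.post = post;
            post.comments.add(this); // Post -> Comment -> Post forms a cycle
        }
    }

    // The DTOs hold the post id instead of a Post reference.
    record CommentDto(long id, long postId, String content) {}

    record PostDto(long id, String title, List<CommentDto> comments) {}

    static PostDto toDto(Post post) {
        List<CommentDto> comments = post.comments.stream()
                .map(c -> new CommentDto(c.id, c.post.id, c.content))
                .toList();
        return new PostDto(post.id, post.title, comments);
    }

    public static void main(String[] args) {
        Post post = new Post(1L, "hello");
        new Comment(10L, "first", post);
        System.out.println(toDto(post));
    }
}
```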

Fields

Either use @JsonIgnore or use only some of the entity's fields. The latter, which does not expose the entity itself, is the DTO approach. @JsonIgnore only prevents the Could not write JSON: Infinite recursion (StackOverflowError) error; the entity is still exposed as-is.

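Many-to-one

This is the case of converting to a DTO instead of exposing the entity to the client in a many-to-one association.

Reaching the associated entity through its reference causes N + 1 queries.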

fetch join

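There are broadly two ways to use fetch join.

One is to load the entity with fetch join and then convert only the needed fields to a DTO; the other is to query directly into a DTO.

Querying directly into a DTO can be written cleanly with the Querydsl library, but with hand-written JPQL the full package path of the DTO must be spelled out, which can hurt readability.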
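
To make the readability point concrete, here is roughly what such a JPQL string looks like. The package com.example.dto and the CommentTitleResponse DTO are assumptions for this sketch, not names from this project; the example only builds the query string and does not execute it.

```java
// Shape of a JPQL constructor expression that selects straight into a DTO.
class DtoProjectionQuery {

    // Hypothetical DTO; in a real project it would be a top-level class
    // in the package named in the query below.
    record CommentTitleResponse(Long id, String postTitle, String content) {}

    // Assumed package name, used only for illustration.
    static final String DTO_PACKAGE = "com.example.dto";

    // The constructor arguments must match the DTO constructor in order and type.
    static String jpql() {
        return "select new " + DTO_PACKAGE + ".CommentTitleResponse(c.id, p.title, c.content) "
                + "from Comment c join c.post p where p.id = :postId";
    }

    public static void main(String[] args) {
        System.out.println(jpql());
    }
}
```

The fully qualified class name has to appear inside the query string, and it grows with the depth of the package.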

One-to-many

This is the case of converting to a DTO instead of exposing the entity to the client in a one-to-many association.

fetch join + batch size

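In a one-to-many (1:N) association, fetch joining the N side makes paging impossible. In addition, two or more N-side collections cannot be fetch joined together; doing so throws MultipleBagFetchException.

Among the associated entities, those on the N side can be loaded with batch size and those on the 1 side with fetch join.

If DTOs are queried directly instead of converting entities afterward, then without the Querydsl library the JPQL must contain the full package path of the DTO, which can hurt readability.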

