JPA Query Optimization
Essential patterns for optimizing JPA queries with collections.
Core Principles
1. Default Lazy Loading with @BatchSize
Always use lazy loading for collections:
@Entity
class Product(
@OneToMany(
mappedBy = "product",
fetch = FetchType.LAZY
)
@BatchSize(size = 50)
var reviews: MutableSet<Review> = mutableSetOf()
)
Why: - Prevents loading unnecessary data - @BatchSize mitigates N+1 problems - Provides flexibility for different use cases
Batch size selection: - Small (10-20): Lower memory, more queries - Medium (40-100): Balanced (recommended) - Large (500+): Fewer queries, higher memory
2. Avoid Eager Loading
// ❌ BAD - Always loads everything
@OneToMany(fetch = FetchType.EAGER)
var reviews: Set<Review> = setOf()
// ✅ GOOD - Load on demand
@OneToMany(fetch = FetchType.LAZY)
@BatchSize(size = 50)
var reviews: MutableSet<Review> = mutableSetOf()
Problems with EAGER: - Cannot be overridden at runtime - Multiple eager collections create Cartesian products - Loads data even when not needed
Query Strategies
When to Modify Entities
Use JPQL with JOIN FETCH:
@Query("""
select distinct p from Product p
left join fetch p.reviews
where p.id = :id
""")
fun findByIdWithReviews(id: Long): Product?
Key points:
- Use DISTINCT to avoid duplicates
- Only fetch collections you actually need
- Wrap in @Transactional
Read-Only Operations
Use projections for better performance:
@Query(
nativeQuery = true,
value = """
select
p.id as productId,
p.name as productName,
p.price as price
from products p
where p.category_id = :categoryId
"""
)
fun findProductSummaries(categoryId: Long): List<ProductSummary>
interface ProductSummary {
val productId: Long
val productName: String
val price: BigDecimal
}
Benefits: - Faster - only loads needed columns - No entity management overhead - Great for DTOs and API responses
N+1 Query Problem
The Problem
// ❌ BAD - Triggers 1 + N queries
val products = productRepository.findAll()
products.forEach { product ->
println(product.reviews.size) // Triggers query per product!
}
Solution 1: @BatchSize (Already Applied)
@BatchSize(size = 50) // Reduces N queries to N/50 queries
var reviews: MutableSet<Review> = mutableSetOf()
Solution 2: JOIN FETCH
@Query("""
select distinct p from Product p
left join fetch p.reviews
where p.id in :ids
""")
fun findByIdsWithReviews(ids: List<Long>): List<Product>
Solution 3: Separate Bulk Query
// Query 1: Get products
val products = productRepository.findByCategory(categoryId)
// Query 2: Get all reviews in one query
val allReviews = reviewRepository.findByProductIds(products.map { it.id })
val reviewsByProduct = allReviews.groupBy { it.productId }
// Combine in memory
Cartesian Product Problem
The Problem
// ❌ BAD - Creates M × N rows!
@Query("""
select p from Product p
left join fetch p.reviews
left join fetch p.images
where p.id = :id
""")
If a product has 5 reviews and 3 images, you get 15 rows!
Solution: Separate Queries
// Query 1: Product with reviews
@Query("""
select distinct p from Product p
left join fetch p.reviews
where p.id = :id
""")
fun findWithReviews(id: Long): Product?
// Query 2: Product with images
@Query("""
select distinct p from Product p
left join fetch p.images
where p.id = :id
""")
fun findWithImages(id: Long): Product?
Projections with Collections
The Limitation
Native SQL projections cannot contain collections directly:
// ❌ This does NOT work
interface ProductProjection {
val name: String
val reviews: List<ReviewProjection> // Cannot do this!
}
Solution: Separate Queries
// Query 1: Product projection
val product = productRepository.findProductSummary(id)
// Query 2: Reviews projection
val reviews = reviewRepository.findByProductId(id)
// Combine in service
return ProductWithReviewsDto(
name = product.name,
reviews = reviews
)
Example Implementation
@Service
class ProductService(
private val productRepository: ProductRepository,
private val reviewRepository: ReviewRepository
) {
fun getProductDetails(id: Long): ProductDetailsDto {
// Query 1: Product data
val product = productRepository.findProductSummary(id)
?: throw NotFoundException()
// Query 2: Reviews
val reviews = reviewRepository.findReviewsByProductId(id)
// Combine
return ProductDetailsDto(
id = product.productId,
name = product.productName,
price = product.price,
reviews = reviews.map { ReviewDto(it.rating, it.comment) }
)
}
}
Bulk Operations
For multiple entities, avoid N+1 with IN clause:
@Query(
nativeQuery = true,
value = """
select
r.product_id as productId,
r.rating as rating,
r.comment as comment
from reviews r
where r.product_id in :productIds
"""
)
fun findByProductIds(productIds: List<Long>): List<ReviewProjection>
// Usage
val allReviews = reviewRepository.findByProductIds(productIds)
val reviewsByProduct = allReviews.groupBy { it.productId }
Transaction Management
Always use @Transactional when accessing lazy collections:
// ❌ BAD - LazyInitializationException
fun getReviews(productId: Long): List<String> {
val product = productRepository.findById(productId).get()
return product.reviews.map { it.comment } // Exception!
}
// ✅ GOOD
@Transactional(readOnly = true)
fun getReviews(productId: Long): List<String> {
val product = productRepository.findById(productId).get()
return product.reviews.map { it.comment }
}
Decision Tree
Need to modify the entity?
├─ YES → Use JPQL
│ └─ Need collections?
│ ├─ YES → JOIN FETCH
│ └─ NO → Simple query
│
└─ NO (read-only)
└─ Use Projections
└─ Need collections?
├─ YES → Separate queries
└─ NO → Single projection
Performance Checklist
- [ ] All
@OneToManyuseFetchType.LAZY - [ ] All collections have
@BatchSizeannotation - [ ] Use JOIN FETCH only when collections are needed
- [ ] Use projections for read-only operations
- [ ] Bulk queries with IN clause for multiple entities
- [ ]
@Transactionalon methods accessing lazy collections - [ ]
DISTINCTwith JOIN FETCH - [ ] Database indexes on foreign keys
Common Mistakes
1. Eager Loading Collections
// ❌ Bad
@OneToMany(fetch = FetchType.EAGER)
2. No @BatchSize
// ❌ Bad - Missing @BatchSize
@OneToMany(fetch = FetchType.LAZY)
var reviews: Set<Review> = setOf()
3. Loading Full Entities for Read Operations
// ❌ Bad - Loads everything
val products = productRepository.findAll()
return products.map { it.name }
// ✅ Good - Only loads names
@Query("select p.name from Product p")
fun findAllNames(): List<String>
4. Queries in Loops
// ❌ Bad - N queries
productIds.forEach { id ->
reviewRepository.findByProductId(id)
}
// ✅ Good - 1 query
reviewRepository.findByProductIds(productIds)
5. Missing DISTINCT with JOIN FETCH
// ❌ Bad - Returns duplicates
select p from Product p
left join fetch p.reviews
// ✅ Good
select distinct p from Product p
left join fetch p.reviews
Debugging Performance
Enable SQL Logging
spring:
jpa:
show-sql: true
properties:
hibernate:
format_sql: true
use_sql_comments: true
Look for These Patterns
N+1 Queries:
SELECT * FROM products WHERE ...;
SELECT * FROM reviews WHERE product_id = 1;
SELECT * FROM reviews WHERE product_id = 2;
-- Many sequential queries
Fix: Use @BatchSize, JOIN FETCH, or bulk queries
Cartesian Products:
-- Returns M × N rows instead of N rows
SELECT p.*, r.*, i.*
FROM products p
LEFT JOIN reviews r ON ...
LEFT JOIN images i ON ...
Fix: Use separate queries
Additional Resources
- Hibernate Documentation
- Spring Data JPA Reference
- Vlad Mihalcea's Blog - Excellent resource for JPA performance