50+ Hibernate & JPA Interview Questions 2025: Entities, N+1, Caching & More

· 37 min read

Tags: hibernate, jpa, java, database, orm, interview-preparation

Every Java backend application persists data. And in the Java ecosystem, that usually means JPA with Hibernate. The ORM abstraction lets you think in objects rather than SQL tables - but that abstraction can bite you when you don't understand what's happening underneath.

Interviewers know this. They'll ask about lazy loading, then probe whether you've actually debugged a LazyInitializationException in production. They'll ask about the N+1 problem, expecting you to describe how you've identified and fixed it. Surface-level JPA knowledge crumbles under these questions.

This guide covers 50+ Hibernate and JPA interview questions from fundamentals to performance optimization - the depth interviewers expect from serious Java backend developers.

Table of Contents

  1. JPA Fundamentals Questions
  2. Entity Mapping Questions
  3. Entity Relationship Questions
  4. Querying Questions
  5. Fetching Strategy Questions
  6. Transaction and Locking Questions
  7. Performance Optimization Questions
  8. Common Pitfalls Questions

JPA Fundamentals Questions

Understanding the distinction between JPA as a specification and Hibernate as an implementation is foundational knowledge for any Java backend interview.

What is the difference between JPA and Hibernate?

JPA (Jakarta Persistence API, formerly Java Persistence API) is a specification that defines a standard set of interfaces and annotations for object-relational mapping in Java. It doesn't contain any implementation code - it's purely a contract that defines how ORM should work. Think of it like an interface in Java: it declares what methods exist, but doesn't provide the actual code.

Hibernate is the most popular implementation of the JPA specification. It provides the actual code that makes persistence work. When you call entityManager.persist(), Hibernate's code executes behind the scenes. Hibernate also offers features beyond the JPA spec, like @Formula for computed columns, additional caching options, and Hibernate-specific Criteria extensions.

JPA defines:

  • Annotations for mapping objects to tables (@Entity, @Table, @Column)
  • EntityManager interface for persistence operations
  • JPQL query language
  • Transaction management integration
  • Lifecycle callbacks

Hibernate provides:

  • Implementation of all JPA interfaces
  • Additional proprietary features
  • Performance optimizations
  • Extended caching capabilities

When possible, stick to JPA annotations for portability - but know that most production applications use Hibernate-specific features when needed.
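As a concrete illustration of a Hibernate-only feature, @Formula maps a read-only property to a SQL expression that the database computes on every SELECT. A minimal sketch - the orders table and user_id column are illustrative names, not from any specific schema:

```java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import org.hibernate.annotations.Formula;

@Entity
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // Hibernate-specific: evaluated by the database when the entity is
    // loaded, never written back. Pure JPA has no equivalent annotation.
    @Formula("(select count(*) from orders o where o.user_id = id)")
    private long orderCount;

    public long getOrderCount() {
        return orderCount;
    }
}
```

If portability to another JPA provider matters, the same result can be achieved with an explicit query instead.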

What is the EntityManager and how do you use it?

The EntityManager is your primary interface for interacting with the persistence context in JPA. It's responsible for all CRUD operations on entities - creating, reading, updating, and deleting. Every persistence operation in JPA flows through an EntityManager instance, which manages the lifecycle of entities and their synchronization with the database.

In Spring applications, you typically inject the EntityManager using @PersistenceContext. This gives you a proxy that's properly scoped to the current transaction. Understanding the key EntityManager methods and their behaviors is essential for working with JPA effectively.

@Repository
public class UserRepository {
 
    @PersistenceContext
    private EntityManager em;
 
    public User save(User user) {
        if (user.getId() == null) {
            em.persist(user);  // INSERT - entity becomes managed
            return user;
        } else {
            return em.merge(user);  // UPDATE - returns managed copy
        }
    }
 
    public User findById(Long id) {
        return em.find(User.class, id);  // Returns managed entity or null
    }
 
    public void delete(User user) {
        em.remove(em.contains(user) ? user : em.merge(user));
    }
}

What is the persistence context and how does it work?

The persistence context is a first-level cache that tracks all entities loaded within a transaction. It's the core concept that makes JPA's "managed entity" behavior possible. When you load an entity through the EntityManager, it enters the persistence context and becomes "managed" - meaning JPA tracks all changes to it automatically.

One of the most important behaviors of the persistence context is identity management. If you load the same entity twice within a transaction, JPA returns the exact same object instance - not a copy. This ensures consistency within your transaction and enables the automatic dirty checking that makes JPA convenient.

@Transactional
public void demonstratePersistenceContext() {
    // Load user - entity is now "managed"
    User user1 = em.find(User.class, 1L);
 
    // Load same user again - returns SAME INSTANCE (cached)
    User user2 = em.find(User.class, 1L);
    assert user1 == user2;  // true - same object reference
 
    // Changes to managed entity are tracked
    user1.setEmail("new@example.com");
    // No explicit save needed - dirty checking detects the change
    // At transaction commit, UPDATE is executed automatically
}

What are the different entity states in JPA?

JPA entities exist in one of four lifecycle states, and understanding these states is crucial for avoiding common bugs. The state determines whether changes to an entity are tracked and synchronized with the database. Moving entities between states incorrectly is one of the most common sources of JPA-related bugs.

A NEW (transient) entity has just been created with new and isn't associated with any persistence context. A MANAGED entity is associated with a persistence context, and changes are automatically tracked. A DETACHED entity was previously managed but the persistence context closed (typically when the transaction ended). A REMOVED entity is scheduled for deletion at the next flush.

stateDiagram-v2
    [*] --> NEW: new Entity()
    NEW --> MANAGED: persist()
    MANAGED --> REMOVED: remove()
    REMOVED --> MANAGED: persist()
    MANAGED --> DETACHED: detach() / clear() / transaction ends
    DETACHED --> MANAGED: merge()
    REMOVED --> [*]: flush/commit

// NEW - just created, not associated with persistence context
User user = new User();
user.setEmail("test@example.com");
 
// MANAGED - associated with persistence context, changes tracked
em.persist(user);  // Now managed
 
// DETACHED - was managed, but persistence context closed
// (e.g., after transaction ends or em.detach())
User detachedUser = user;  // After transaction commits
 
// Re-attach with merge() - returns a NEW managed instance
User managedAgain = em.merge(detachedUser);
assert managedAgain != detachedUser;  // Different objects!
 
// REMOVED - scheduled for deletion
em.remove(managedAgain);

What happens if you modify a detached entity?

Nothing happens automatically. This is a critical point that trips up many developers. Once an entity becomes detached (typically after the transaction ends), it's no longer tracked by any persistence context. You can modify its fields all you want, but those changes exist only in memory - they won't be synchronized to the database.

To persist changes to a detached entity, you must explicitly call merge(). This copies the detached entity's state onto a new managed instance. Importantly, merge() returns a new managed object - the original detached entity remains detached. This is a common source of bugs when developers modify an entity, call merge, but continue working with the original detached reference.

// Common bug pattern
User detachedUser = getDetachedUserSomehow();
detachedUser.setEmail("new@email.com");
 
em.merge(detachedUser);  // Returns managed copy, but we ignore it!
detachedUser.setName("New Name");  // This change is lost!
 
// Correct pattern
User detachedUser = getDetachedUserSomehow();
detachedUser.setEmail("new@email.com");
 
User managedUser = em.merge(detachedUser);  // Keep the managed reference
managedUser.setName("New Name");  // This change will persist

Entity Mapping Questions

Entity mapping is where you define how Java objects correspond to database tables. Getting these mappings right is essential for a well-functioning JPA application.

How do you map a basic entity to a database table?

Entity mapping starts with the @Entity annotation, which tells JPA this class should be persisted. The @Table annotation specifies the database table name (defaulting to the class name if omitted). Every entity must have a primary key field marked with @Id, and you typically want JPA to generate IDs automatically using @GeneratedValue.

Beyond the basics, you'll use @Column to customize column mappings, @Enumerated for enum fields, and lifecycle callbacks like @PrePersist for automatic timestamp management. JPA requires a no-arg constructor (public or protected) so the provider can instantiate entities and set field values via reflection.

@Entity
@Table(name = "users")
public class User {
 
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
 
    @Column(name = "email", nullable = false, unique = true)
    private String email;
 
    @Column(name = "full_name", length = 100)
    private String fullName;
 
    @Enumerated(EnumType.STRING)
    @Column(name = "status")
    private UserStatus status;
 
    @Column(name = "created_at")
    private LocalDateTime createdAt;
 
    @PrePersist
    protected void onCreate() {
        createdAt = LocalDateTime.now();
    }
 
    // Getters and setters required by JPA
}

What are the key JPA mapping annotations?

JPA provides a comprehensive set of annotations for mapping entities to database structures. Each annotation serves a specific purpose, from basic column mapping to complex relationship definitions. Knowing which annotation to use in which situation is fundamental JPA knowledge.

| Annotation | Purpose |
| --- | --- |
| @Entity | Marks class as JPA entity |
| @Table | Specifies table name (defaults to class name) |
| @Id | Marks primary key field |
| @GeneratedValue | Auto-generation strategy for ID |
| @Column | Column mapping and constraints |
| @Enumerated | Enum storage (STRING or ORDINAL) |
| @Temporal | Date/time type (legacy, use java.time) |
| @Transient | Exclude field from persistence |
| @Lob | Large object (BLOB/CLOB) |
| @Embedded / @Embeddable | Composite value types |

What are the different ID generation strategies and when should you use each?

JPA provides four ID generation strategies, each with different trade-offs for performance, portability, and database compatibility. The choice of strategy can significantly impact batch insert performance and application behavior, so understanding the implications of each is important.

IDENTITY relies on database auto-increment columns. It's simple and widely supported, but batch inserts are less efficient because Hibernate needs the generated ID immediately after each insert, requiring a round-trip to the database for every row.

SEQUENCE uses database sequences and is the most efficient for batch operations. Hibernate can pre-allocate IDs using the allocationSize parameter, eliminating per-insert round-trips. However, not all databases support sequences - MySQL notably has no native sequences (MariaDB added them in 10.3).

TABLE simulates sequences using a database table. It's portable across all databases but has performance overhead and potential contention issues.

AUTO lets Hibernate choose the best strategy for your database.

// IDENTITY: Database auto-increment (MySQL, PostgreSQL SERIAL)
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
// Pros: Simple, database handles it
// Cons: Batch inserts less efficient (needs round-trip for each ID)
 
// SEQUENCE: Database sequence (PostgreSQL, Oracle)
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "user_seq")
@SequenceGenerator(name = "user_seq", sequenceName = "user_sequence", allocationSize = 50)
private Long id;
// Pros: Batch-friendly (can pre-allocate IDs)
// Cons: Not all databases support sequences
 
// TABLE: Simulated sequence using a table
@GeneratedValue(strategy = GenerationType.TABLE)
private Long id;
// Pros: Portable across databases
// Cons: Performance overhead, potential contention
 
// AUTO: Let Hibernate decide based on database
@GeneratedValue(strategy = GenerationType.AUTO)
private Long id;
// Hibernate picks the best strategy for your database

Why might IDENTITY strategy hurt batch insert performance?

With IDENTITY strategy, the database generates the ID during the INSERT statement itself. This creates a fundamental problem for batch operations: Hibernate needs to know the generated ID immediately after insert to properly manage the entity in the persistence context. This requirement forces Hibernate to execute each INSERT individually and retrieve the generated key, eliminating the possibility of true batch inserts.

Consider inserting 1000 entities. With IDENTITY, Hibernate executes 1000 individual INSERT statements, each followed by a call to retrieve the generated ID. With SEQUENCE and an allocationSize of 50, Hibernate can fetch 50 IDs at once with a single sequence call, then batch the actual INSERTs together. The difference in performance can be dramatic - sometimes 10x or more for large batch operations.

// With IDENTITY - each insert is separate
em.persist(user1);  // INSERT + fetch ID
em.persist(user2);  // INSERT + fetch ID
em.persist(user3);  // INSERT + fetch ID
// 3 round-trips minimum
 
// With SEQUENCE (allocationSize=50)
em.persist(user1);  // Uses pre-allocated ID
em.persist(user2);  // Uses pre-allocated ID
em.persist(user3);  // Uses pre-allocated ID
// All can be batched into single statement
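Note that SEQUENCE alone isn't enough - Hibernate only groups the INSERTs into JDBC batches when batching is explicitly enabled. A typical Spring Boot configuration (the property names are standard Hibernate settings; the value 50 is an illustrative choice matching the allocationSize above):

```properties
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
```

order_inserts and order_updates group statements for the same table together, which makes the batches larger and more effective.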

Entity Relationship Questions

Relationships are where JPA complexity really emerges. Understanding the different relationship types and their configurations is essential for effective JPA development.

What are the different relationship types in JPA?

JPA supports four relationship types that map to common database relationship patterns. Each type has different default fetch behaviors and configuration options. Choosing the right relationship type and configuring it properly is crucial for both correctness and performance.

@ManyToOne represents many children pointing to one parent - the most common relationship type. The foreign key lives in the child table. Default fetch is EAGER, which you should almost always override to LAZY.

@OneToMany is the inverse of ManyToOne - one parent with many children. Default fetch is LAZY. Use mappedBy to indicate the owning side.

@OneToOne represents a one-to-one relationship. Often shares the primary key between tables using @MapsId. Default fetch is EAGER.

@ManyToMany requires a join table to represent the relationship. Default fetch is LAZY.

// @ManyToOne - Many orders belong to one user
@Entity
public class Order {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
 
    @ManyToOne(fetch = FetchType.LAZY)  // Override default EAGER!
    @JoinColumn(name = "user_id", nullable = false)
    private User user;
}
 
// @OneToMany - One user has many orders
@Entity
public class User {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
 
    @OneToMany(mappedBy = "user", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Order> orders = new ArrayList<>();
 
    // Helper methods for bidirectional consistency
    public void addOrder(Order order) {
        orders.add(order);
        order.setUser(this);
    }
 
    public void removeOrder(Order order) {
        orders.remove(order);
        order.setUser(null);
    }
}

What is the difference between unidirectional and bidirectional relationships?

A unidirectional relationship means only one side of the relationship knows about the other. A bidirectional relationship means both entities have a reference to each other. The choice between them affects navigation capabilities, cascade behavior, and code complexity.

Unidirectional relationships are simpler and sufficient when you only need to navigate in one direction. For example, if you always load orders first and then need to find their user, a unidirectional @ManyToOne from Order to User is enough - you don't need User to have an orders collection.

Bidirectional relationships add complexity but enable navigation from both sides and are required for cascading operations from parent to children. They require careful maintenance of both sides to keep the object model consistent.

// Unidirectional - Only Order knows about User
@Entity
public class Order {
    @ManyToOne
    private User user;
}
 
@Entity
public class User {
    // No @OneToMany orders - can't navigate from User to Orders
}
 
// Bidirectional - Both sides know about the relationship
@Entity
public class User {
    @OneToMany(mappedBy = "user")
    private List<Order> orders;
}
 
@Entity
public class Order {
    @ManyToOne
    private User user;
}

Use unidirectional when you rarely navigate from parent to children, when the child collection would be huge, or when you want simpler entity design. Use bidirectional when you frequently need to navigate both directions or need cascade operations from parent.

What does the mappedBy attribute do?

The mappedBy attribute indicates which side of a bidirectional relationship is the "inverse" (non-owning) side. In any bidirectional relationship, one side must be designated as the owner - this is the side that controls the foreign key in the database. The mappedBy attribute goes on the inverse side, pointing to the field name on the owning side.

This distinction matters because only changes to the owning side are persisted to the database. If you only set the relationship on the inverse side, the foreign key column won't be updated. This is one of the most common sources of relationship bugs in JPA applications.

// User side - inverse (not owning)
@OneToMany(mappedBy = "user")  // "user" refers to Order.user field
private List<Order> orders;
 
// Order side - owning (has the foreign key)
@ManyToOne
@JoinColumn(name = "user_id")  // This table has the FK column
private User user;

// WRONG - won't create the relationship in DB
user.getOrders().add(order);
// The FK column in orders table won't be set!
 
// RIGHT - set on owning side
order.setUser(user);
// FK column now populated
 
// BEST - set both sides for object consistency
order.setUser(user);
user.getOrders().add(order);

What are cascade types and when should you use each?

Cascade types determine which operations on a parent entity automatically propagate to its children. They simplify code by eliminating the need to explicitly persist or remove child entities, but they can also cause unexpected behavior if misconfigured. Understanding each cascade type is essential for proper relationship management.

| Cascade Type | Propagates |
| --- | --- |
| PERSIST | persist() - insert children with parent |
| MERGE | merge() - update children with parent |
| REMOVE | remove() - delete children with parent |
| REFRESH | refresh() - reload children with parent |
| DETACH | detach() - detach children with parent |
| ALL | All of the above |

@OneToMany(mappedBy = "user", cascade = CascadeType.ALL)
private List<Order> orders;

Be careful with CascadeType.REMOVE on @ManyToOne relationships - deleting a child could cascade to delete the parent and all other children!
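In practice, CascadeType.PERSIST means a single persist() call on the parent inserts the whole graph. A sketch assuming the User/Order mapping and addOrder helper shown earlier:

```java
User user = new User();
Order order = new Order();
user.addOrder(order);   // helper sets both sides of the relationship

em.persist(user);       // cascades: INSERTs the user AND the order
// Without cascade = PERSIST (or ALL), you'd need em.persist(order) too -
// otherwise flushing typically fails because the user references an
// unsaved transient Order
```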

What is the difference between CascadeType.REMOVE and orphanRemoval?

Both result in child entities being deleted, but they trigger under different conditions. CascadeType.REMOVE deletes children when the parent is deleted. orphanRemoval deletes children when they're removed from the parent's collection, even if the parent isn't deleted. OrphanRemoval is stricter - it enforces that children can't exist without their parent.

@OneToMany(mappedBy = "user", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Order> orders;
 
// With orphanRemoval = true
user.getOrders().remove(order);
// Order is deleted from database, not just unlinked
 
// With only CascadeType.REMOVE (no orphanRemoval)
user.getOrders().remove(order);
// Removing from the inverse side alone changes nothing in the database -
// the order's FK still points at the user until you call order.setUser(null)

Querying Questions

JPA provides multiple approaches to querying data. Understanding when to use each approach is key to writing maintainable and performant data access code.

What is JPQL and how does it differ from SQL?

JPQL (Java Persistence Query Language) is an object-oriented query language that operates on entities rather than database tables. While its syntax resembles SQL, JPQL queries reference entity class names and field names, not table and column names. This abstraction makes queries portable across different databases and keeps them aligned with your object model.

When you write a JPQL query, Hibernate translates it to the appropriate SQL for your database. JPQL supports joins, aggregates, subqueries, and most SQL features, but expressed in terms of your entity model. This means if you rename a database column but keep the entity field name the same, your JPQL queries continue to work unchanged.

// Basic JPQL query - note we query User entity, not users table
String jpql = "SELECT u FROM User u WHERE u.status = :status";
List<User> users = em.createQuery(jpql, User.class)
    .setParameter("status", UserStatus.ACTIVE)
    .getResultList();
 
// Join query using entity relationships
String jpql = """
    SELECT o FROM Order o
    JOIN o.user u
    WHERE u.email = :email
    AND o.status = :status
    ORDER BY o.createdAt DESC
    """;
 
// Aggregate functions work like SQL
String jpql = "SELECT COUNT(o), SUM(o.total) FROM Order o WHERE o.user.id = :userId";
Object[] result = em.createQuery(jpql, Object[].class)
    .setParameter("userId", userId)
    .getSingleResult();
Long count = (Long) result[0];
BigDecimal total = (BigDecimal) result[1];

What are Named Queries and why would you use them?

Named Queries are JPQL queries defined statically on entity classes using the @NamedQuery annotation. They're parsed and validated at application startup, catching syntax errors early rather than at runtime. They also provide a central location for queries related to an entity, making the codebase more organized and queries more reusable.

Because Named Queries are validated at startup, you get immediate feedback if a query has syntax errors or references non-existent fields. This is particularly valuable in large applications where a typo in a query string might not be discovered until that specific code path executes in production.

@Entity
@NamedQueries({
    @NamedQuery(
        name = "User.findByStatus",
        query = "SELECT u FROM User u WHERE u.status = :status"
    ),
    @NamedQuery(
        name = "User.findByEmailDomain",
        query = "SELECT u FROM User u WHERE u.email LIKE :domain"
    )
})
public class User {
    // ...
}
 
// Usage
List<User> users = em.createNamedQuery("User.findByStatus", User.class)
    .setParameter("status", UserStatus.ACTIVE)
    .getResultList();

When should you use the Criteria API instead of JPQL?

The Criteria API provides a programmatic, type-safe way to build queries. Unlike JPQL strings, Criteria queries are constructed using Java code, which means the compiler can catch many errors that would only surface at runtime with JPQL. The Criteria API really shines for dynamic queries where the query structure varies based on runtime conditions.

Consider a search form with optional filters for name, email, status, and date range. With JPQL, you'd need string concatenation to build the query, which is error-prone and ugly. With Criteria API, you build predicates conditionally and combine them cleanly. The trade-off is verbosity - simple queries are much more readable in JPQL.

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<User> cq = cb.createQuery(User.class);
Root<User> user = cq.from(User.class);
 
// Build predicates dynamically based on search criteria
List<Predicate> predicates = new ArrayList<>();
 
if (status != null) {
    predicates.add(cb.equal(user.get("status"), status));
}
if (email != null) {
    predicates.add(cb.like(user.get("email"), "%" + email + "%"));
}
if (minAge != null) {
    predicates.add(cb.greaterThanOrEqualTo(user.get("age"), minAge));
}
 
cq.where(predicates.toArray(new Predicate[0]));
cq.orderBy(cb.desc(user.get("createdAt")));
 
List<User> results = em.createQuery(cq).getResultList();

For compile-time type safety, use the JPA metamodel:

// Generated metamodel class User_
cq.where(cb.equal(user.get(User_.status), status));  // Type-safe!
// User_.status is a generated static metamodel attribute, not a string

When should you use native SQL queries?

Native queries let you write raw SQL when JPQL's capabilities are insufficient. This includes database-specific features like window functions, complex CTEs, specific index hints, or when you need maximum performance for a critical query. Native queries bypass JPA's abstraction layer, giving you direct control over the SQL.

The trade-off is portability - native queries may not work if you switch databases. They also require manual result mapping if you're not returning complete entities. Use native queries sparingly, typically for reporting queries or performance-critical operations where JPQL falls short.

// Native SQL query returning entities
String sql = """
    SELECT u.* FROM users u
    WHERE u.created_at >= NOW() - INTERVAL '30 days'
    AND u.status = 'ACTIVE'
    ORDER BY u.created_at DESC
    LIMIT 100
    """;
 
List<User> users = em.createNativeQuery(sql, User.class)
    .getResultList();
 
// With result mapping for projections (DTOs)
// (@SqlResultSetMapping is declared on an entity class)
@SqlResultSetMapping(
    name = "UserSummaryMapping",
    classes = @ConstructorResult(
        targetClass = UserSummary.class,
        columns = {
            @ColumnResult(name = "id", type = Long.class),
            @ColumnResult(name = "email", type = String.class),
            @ColumnResult(name = "order_count", type = Long.class)
        }
    )
)
 
String sql = """
    SELECT u.id, u.email, COUNT(o.id) as order_count
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id, u.email
    """;
 
List<UserSummary> summaries = em.createNativeQuery(sql, "UserSummaryMapping")
    .getResultList();

When should you use projections instead of full entities?

Projections return only the data you need, rather than loading complete entity objects. This improves performance by transferring less data from the database and creating smaller objects in memory. Projections are ideal for read-only scenarios where you don't need to modify entities or navigate their relationships.

Use projections for reports, lists, dropdowns, and API responses that don't expose all entity fields. Fetch full entities when you need to modify and save them, when you need lazy-loaded relationships, or when business logic requires the complete object.

// DTO projection with JPQL
String jpql = """
    SELECT new com.example.UserDTO(u.id, u.email, u.fullName)
    FROM User u
    WHERE u.status = :status
    """;
 
List<UserDTO> dtos = em.createQuery(jpql, UserDTO.class)
    .setParameter("status", UserStatus.ACTIVE)
    .getResultList();
 
// Interface projection (Spring Data JPA)
public interface UserEmailProjection {
    Long getId();
    String getEmail();
}
 
@Query("SELECT u.id as id, u.email as email FROM User u WHERE u.status = :status")
List<UserEmailProjection> findEmailsByStatus(@Param("status") UserStatus status);

Fetching Strategy Questions

Fetching strategies determine when and how related entities are loaded. Getting this right is crucial for JPA performance - and getting it wrong causes the most common JPA performance issues.

What is the difference between lazy and eager loading?

Lazy loading defers fetching related entities until they're actually accessed. When you load an Order, its items collection isn't loaded from the database. Only when your code calls order.getItems() does JPA execute the query to fetch the items. This saves memory and database round-trips when you don't need the related data.

Eager loading fetches related entities immediately along with the parent. When you load an Order with eager items, JPA executes a join or additional query to load items right away. This is simpler - the data is always there when you need it - but can load unnecessary data and cause performance issues.

// LAZY - loads when accessed
@OneToMany(mappedBy = "user", fetch = FetchType.LAZY)
private List<Order> orders;
// SQL: SELECT * FROM users WHERE id = ?
// orders not loaded yet
 
user.getOrders().size();  // NOW triggers query
// SQL: SELECT * FROM orders WHERE user_id = ?
 
// EAGER - loads immediately with parent
@ManyToOne(fetch = FetchType.EAGER)
private User user;
// SQL: SELECT * FROM orders o JOIN users u ON o.user_id = u.id WHERE o.id = ?
// user loaded with order

What are the default fetch types for each relationship?

JPA's default fetch types are based on reasonable assumptions about typical usage patterns, but they're often not optimal for your specific use case. Understanding these defaults helps you make intentional decisions about fetching behavior.

| Relationship | Default | Reason |
| --- | --- | --- |
| @ManyToOne | EAGER | Usually need the parent |
| @OneToOne | EAGER | Usually need the related entity |
| @OneToMany | LAZY | Collection could be huge |
| @ManyToMany | LAZY | Collection could be huge |

Best practice: Make everything LAZY and explicitly fetch what you need.

@ManyToOne(fetch = FetchType.LAZY)  // Override default EAGER
@JoinColumn(name = "user_id")
private User user;

What is the N+1 problem and why is it so common?

The N+1 problem occurs when loading a list of entities triggers additional queries for each entity's relationships. If you load 100 orders and then access each order's user, you execute 1 query for the orders plus 100 queries for the users - hence "N+1". This is the single most common JPA performance problem.

The N+1 problem is common because the code looks innocent. You write a simple loop that accesses a relationship, and everything works correctly. You don't realize there's a performance issue until you test with realistic data volumes or notice slow response times in production.

// This code has N+1 problem
List<Order> orders = em.createQuery("SELECT o FROM Order o", Order.class)
    .getResultList();
// SQL: SELECT * FROM orders (1 query)
 
for (Order order : orders) {
    System.out.println(order.getUser().getEmail());
    // SQL: SELECT * FROM users WHERE id = ? (N queries!)
}
// Total: 1 + N queries

If you load 100 orders, you execute 101 queries. With 1000 orders, 1001 queries.

How do you solve the N+1 problem with JOIN FETCH?

JOIN FETCH is the most common solution to N+1. It tells JPA to load the relationship in the same query as the parent entity, using a SQL join. Instead of N+1 queries, you get a single query that retrieves all the data at once.

String jpql = "SELECT o FROM Order o JOIN FETCH o.user";
List<Order> orders = em.createQuery(jpql, Order.class)
    .getResultList();
// SQL: SELECT o.*, u.* FROM orders o JOIN users u ON o.user_id = u.id
// Single query, users loaded with orders
 
for (Order order : orders) {
    System.out.println(order.getUser().getEmail());  // No additional query
}

Be careful with multiple collection fetches - they can cause cartesian products:

// WARNING: Multiple collection fetches can cause cartesian product
String jpql = """
    SELECT o FROM Order o
    JOIN FETCH o.items
    JOIN FETCH o.payments
    """;
// If order has 3 items and 2 payments:
// Returns 6 rows per order (3 * 2), duplicating data
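One common workaround - a sketch, not the only option - is to split the fetches into separate queries. Because both queries populate the same persistence context, the managed Order instances end up with both collections initialized:

```java
// Query 1: orders with their items
List<Order> orders = em.createQuery(
        "SELECT DISTINCT o FROM Order o JOIN FETCH o.items", Order.class)
    .getResultList();

// Query 2: same orders with their payments - Hibernate attaches the
// payments to the already-managed Order instances from query 1
em.createQuery(
        "SELECT DISTINCT o FROM Order o JOIN FETCH o.payments WHERE o IN :orders",
        Order.class)
    .setParameter("orders", orders)
    .getResultList();
```

Two queries instead of one, but each returns a clean result set with no row multiplication.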

What is @EntityGraph and how does it help?

Entity graphs provide a declarative way to specify which relationships to fetch without modifying your query. You define a graph of attributes to load, then apply it to queries. This separates the "what to fetch" concern from the query logic itself, making it easier to reuse queries with different fetch patterns.

Entity graphs are especially useful with Spring Data JPA, where you can annotate repository methods to use specific graphs or define ad-hoc graphs inline.

@Entity
@NamedEntityGraph(
    name = "Order.withUserAndItems",
    attributeNodes = {
        @NamedAttributeNode("user"),
        @NamedAttributeNode("items")
    }
)
public class Order {
    // ...
}
 
// Usage with EntityManager
EntityGraph<?> graph = em.getEntityGraph("Order.withUserAndItems");
Map<String, Object> hints = Map.of("jakarta.persistence.fetchgraph", graph);
Order order = em.find(Order.class, orderId, hints);
// (use "javax.persistence.fetchgraph" on pre-Jakarta JPA 2.x providers)
 
// Usage with Spring Data JPA
@EntityGraph(value = "Order.withUserAndItems")
List<Order> findByStatus(OrderStatus status);
 
// Ad-hoc entity graph
@EntityGraph(attributePaths = {"user", "items"})
List<Order> findByUserId(Long userId);

How does @BatchSize help with N+1?

@BatchSize is a Hibernate-specific annotation that reduces N+1 by loading related entities in batches rather than one at a time. Instead of executing N individual queries for each relationship, Hibernate batches them using IN clauses, drastically reducing the query count.

With 100 users and @BatchSize(size = 25), instead of 100 queries for orders, Hibernate executes 4 queries - each loading orders for 25 users at once. This still issues multiple queries, unlike a single JOIN FETCH, but it requires no query changes and applies everywhere the association is loaded.

@Entity
public class User {
    @OneToMany(mappedBy = "user")
    @BatchSize(size = 25)  // Hibernate-specific
    private List<Order> orders;
}
 
// Without @BatchSize: 100 users = 100 queries for orders
// With @BatchSize(25): 100 users = 4 queries (25 users per query)
// SQL: SELECT * FROM orders WHERE user_id IN (?, ?, ?, ... 25 params)

You can set a global default in configuration:

spring.jpa.properties.hibernate.default_batch_fetch_size=25

How do you identify N+1 problems in an existing application?

N+1 problems are often invisible during development with small data sets. You need specific techniques to identify them before they cause production issues.

  1. Enable SQL logging: spring.jpa.show-sql=true or logging.level.org.hibernate.SQL=DEBUG
  2. Watch for repeated similar queries - same query structure with different parameter values
  3. Use datasource-proxy to count queries per request
  4. Enable Hibernate statistics: hibernate.generate_statistics=true
  5. Test with realistic data volumes - N+1 often invisible with 5 records, obvious with 500

// Simple query counter for tests
@Test
void shouldLoadOrdersWithoutN1() {
    Statistics stats = entityManager.unwrap(Session.class)
        .getSessionFactory().getStatistics();
    stats.setStatisticsEnabled(true);
    stats.clear();
 
    List<Order> orders = orderService.findAllWithUsers();
    orders.forEach(o -> o.getUser().getEmail());
 
    assertThat(stats.getQueryExecutionCount()).isLessThanOrEqualTo(2);
}

Transaction and Locking Questions

Transactions ensure data consistency, and locking prevents concurrent modification issues. Understanding both is essential for building reliable JPA applications.

How does @Transactional work in Spring?

Spring's @Transactional annotation wraps method execution in a database transaction. When the method begins, Spring starts a transaction (or joins an existing one). When the method completes successfully, the transaction commits. If a RuntimeException is thrown, the transaction rolls back automatically.

The annotation works through AOP proxies, which has implications for self-calls (calling a transactional method from the same class won't start a new transaction). You can configure propagation behavior to control how transactions interact, and isolation levels to control visibility of concurrent changes.
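A minimal sketch of the self-invocation pitfall (the class and method names here are illustrative, not from this article):

```java
@Service
public class ReportService {

    @Transactional
    public void generateAll() {
        // Self-call goes through 'this', NOT the Spring proxy, so the
        // REQUIRES_NEW below is silently ignored - generateOne runs
        // in this method's transaction, not a new one
        generateOne(42L);
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void generateOne(Long reportId) {
        // ...
    }
}
```

Typical fixes: move the inner method to a separate bean, or inject the service's own proxy and call through it.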

@Service
public class OrderService {
 
    @Transactional
    public Order createOrder(CreateOrderRequest request) {
        Order order = new Order(request);
        return orderRepository.save(order);
        // Transaction commits automatically on method success
        // Rolls back on RuntimeException
    }
 
    @Transactional(readOnly = true)
    public List<Order> findOrders(OrderCriteria criteria) {
        // readOnly hint enables optimizations:
        // - No dirty checking
        // - Possible replica routing
        return orderRepository.findByCriteria(criteria);
    }
 
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void logAuditEvent(AuditEvent event) {
        // Runs in NEW transaction
        // Commits even if caller's transaction rolls back
        auditRepository.save(event);
    }
}

What are the different transaction propagation types?

Transaction propagation controls how a transactional method behaves when called from another transactional context. The default REQUIRED behavior joins an existing transaction or creates a new one - suitable for most cases. Other propagation types handle specific scenarios like audit logging (REQUIRES_NEW) or read-only operations (SUPPORTS).

| Type | Behavior |
| --- | --- |
| REQUIRED (default) | Join existing or create new |
| REQUIRES_NEW | Always create new, suspend existing |
| SUPPORTS | Join if exists, non-transactional otherwise |
| NOT_SUPPORTED | Suspend existing, run non-transactional |
| MANDATORY | Must have existing, throw if none |
| NEVER | Must not have existing, throw if present |
| NESTED | Nested transaction with savepoint |

What is optimistic locking and how do you implement it?

Optimistic locking assumes conflicts are rare and doesn't acquire database locks upfront. Instead, it uses a version column to detect conflicts at commit time. When you update an entity, Hibernate includes the version in the WHERE clause. If another transaction modified the row (incrementing the version), the update affects zero rows, and Hibernate throws OptimisticLockException.

This approach is called "optimistic" because it optimistically assumes no one else will modify the data. It's ideal for web applications where users might view data for a while before submitting changes, and database locks would be impractical.

@Entity
public class Product {
    @Id
    private Long id;
 
    @Version  // Optimistic lock column
    private Long version;
 
    private String name;
    private BigDecimal price;
    private Integer quantity;
}

How the version check works:

// Transaction 1
Product p1 = em.find(Product.class, 1L);  // version = 5
p1.setQuantity(p1.getQuantity() - 1);
 
// Transaction 2 (concurrent)
Product p2 = em.find(Product.class, 1L);  // version = 5
p2.setQuantity(p2.getQuantity() - 1);
 
// Transaction 1 commits first
// SQL: UPDATE products SET quantity = ?, version = 6 WHERE id = 1 AND version = 5
// Success! Version incremented to 6
 
// Transaction 2 tries to commit
// SQL: UPDATE products SET quantity = ?, version = 6 WHERE id = 1 AND version = 5
// Fails! Version is now 6, not 5
// Throws OptimisticLockException
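The usual response to an optimistic lock failure is to retry the whole unit of work. Below is a minimal, hypothetical retry helper in plain Java - in a real application the Supplier would wrap a @Transactional service call and the caught type would be jakarta.persistence.OptimisticLockException (an IllegalStateException stands in here so the sketch is self-contained):

```java
import java.util.function.Supplier;

public final class OptimisticRetry {

    /** Runs the work, retrying up to maxAttempts times on a version conflict. */
    public static <T> T withRetry(int maxAttempts, Supplier<T> work) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (IllegalStateException conflict) {
                // stand-in for OptimisticLockException: another transaction
                // won the race; the caller reloads fresh state and retries
                last = conflict;
            }
        }
        throw last;  // every attempt conflicted - give up
    }
}
```

Each retry must reload the entity so the new version number is read; retrying with the stale detached copy just fails again.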

What is pessimistic locking and when should you use it?

Pessimistic locking acquires database locks immediately when loading entities, blocking other transactions from accessing the same rows. This approach is "pessimistic" because it assumes conflicts are likely and prevents them upfront rather than detecting them later.

Use pessimistic locking when conflicts are common (high contention for the same rows), when transactions are short (minimizing lock time), when conflict resolution would be expensive, or when operations must be serialized (like financial transactions). The downside is reduced concurrency - other transactions must wait.

// Lock for update - blocks other transactions
Product product = em.find(Product.class, 1L, LockModeType.PESSIMISTIC_WRITE);
// SQL: SELECT * FROM products WHERE id = 1 FOR UPDATE
 
// Other transactions block until this transaction commits
 
// With Spring Data JPA
@Lock(LockModeType.PESSIMISTIC_WRITE)
@Query("SELECT p FROM Product p WHERE p.id = :id")
Optional<Product> findByIdForUpdate(@Param("id") Long id);

| Lock Mode | SQL | Use Case |
| --- | --- | --- |
| PESSIMISTIC_READ | FOR SHARE | Allow concurrent reads, block writes |
| PESSIMISTIC_WRITE | FOR UPDATE | Block all access |
| PESSIMISTIC_FORCE_INCREMENT | FOR UPDATE + version increment | Force version update |
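You usually don't want to block indefinitely on a contended row. The standard jakarta.persistence.lock.timeout hint (in milliseconds; actual support varies by provider and database) lets you fail fast instead - a sketch:

```java
// Wait at most 5 seconds for the row lock (0 = fail immediately)
Map<String, Object> props = Map.of("jakarta.persistence.lock.timeout", 5000);

try {
    Product product = em.find(Product.class, 1L,
            LockModeType.PESSIMISTIC_WRITE, props);
    // ... work with the locked row ...
} catch (LockTimeoutException e) {
    // Row still locked after 5s - surface a "try again later" error
    // rather than tying up a connection and a request thread
}
```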

When would you use pessimistic over optimistic locking?

The choice depends on your conflict expectations and business requirements. Optimistic locking is generally preferred for web applications because users often view data for extended periods before submitting changes, and holding database locks would be impractical.

Use pessimistic locking when:

  • High contention (many concurrent updates to same rows)
  • Short transactions where lock time is minimal
  • Conflict resolution is complex or expensive
  • Financial transactions requiring serialization

Use optimistic locking when:

  • Low contention (conflicts rare)
  • Long-running transactions (don't hold locks)
  • Web applications (users may abandon sessions)
  • Retry on conflict is acceptable

What are transaction isolation levels?

Isolation levels control the visibility of changes made by concurrent transactions. Higher isolation prevents more anomalies but reduces concurrency. Most applications use READ_COMMITTED (the default for most databases) and handle edge cases with explicit locking when needed.

| Level | Dirty Reads | Non-Repeatable Reads | Phantom Reads |
| --- | --- | --- | --- |
| READ_UNCOMMITTED | Yes | Yes | Yes |
| READ_COMMITTED | No | Yes | Yes |
| REPEATABLE_READ | No | No | Yes |
| SERIALIZABLE | No | No | No |

@Transactional(isolation = Isolation.READ_COMMITTED)
public void process() {
    // ...
}

Performance Optimization Questions

JPA can be fast or painfully slow. These optimizations make the difference between a responsive application and one that frustrates users.

How does the first-level cache (persistence context) work?

The first-level cache is the persistence context itself, and it's automatic - you don't need to configure it. Within a transaction, if you load the same entity twice, JPA returns the exact same object instance from memory rather than hitting the database again.

This caching ensures consistency within a transaction and improves performance by eliminating redundant queries. However, it can also cause memory issues if you load too many entities in a single transaction, as they all remain in the persistence context until it closes.

@Transactional
public void demonstrateFirstLevelCache() {
    User user1 = em.find(User.class, 1L);  // Database query
    User user2 = em.find(User.class, 1L);  // Cache hit, no query
 
    assert user1 == user2;  // Same instance
}

For large batch operations, periodically clear the persistence context to prevent memory issues:

// Problem: Loading 100k entities fills the persistence context
List<User> users = userRepository.findAll();  // 100k entities in memory
 
// Solution: flush and clear periodically so Hibernate stops
// dirty-checking processed entities. Note: the List itself still
// references all 100k objects, so for real memory relief load the
// data in pages or via a scrollable stream instead.
int batchSize = 1000;
for (int i = 0; i < users.size(); i++) {
    processUser(users.get(i));
    if (i > 0 && i % batchSize == 0) {
        em.flush();
        em.clear();  // Detach tracked entities
    }
}
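Loading page by page avoids ever materializing all rows at once. A sketch using Spring Data's pagination API (the repository, em, and processUser method are assumed from the surrounding examples):

```java
int pageSize = 1000;
Pageable pageable = PageRequest.of(0, pageSize);
Page<User> page;
do {
    page = userRepository.findAll(pageable);  // only one page in memory
    page.forEach(this::processUser);
    em.flush();
    em.clear();                               // detach the processed page
    pageable = pageable.next();
} while (page.hasNext());
```

If processUser changes values that affect the query's ordering or filtering, offset pagination can skip rows; keyset (seek) pagination is the safer variant in that case.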

What is the second-level cache and when should you use it?

The second-level cache is shared across sessions and transactions, unlike the first-level cache which is transaction-scoped. It can dramatically reduce database load for frequently-accessed, rarely-changed data. However, it adds complexity around cache invalidation and can cause stale data issues if not configured correctly.

Enable the second-level cache for read-mostly reference data like configuration, country lists, or product catalogs that don't change often. Avoid it for frequently-updated data or data that must always be fresh.

# Enable in configuration
spring.jpa.properties.hibernate.cache.use_second_level_cache=true
spring.jpa.properties.hibernate.cache.region.factory_class=org.hibernate.cache.jcache.JCacheRegionFactory

// Mark entity as cacheable
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Product {
    // ...
}

| Strategy | Use Case |
| --- | --- |
| READ_ONLY | Immutable data (countries, config) |
| READ_WRITE | Read-mostly, occasional updates |
| NONSTRICT_READ_WRITE | Eventual consistency OK |
| TRANSACTIONAL | Full transaction support (JTA required) |

How do you optimize batch inserts in JPA?

Batch inserts require several configurations working together. You need JDBC batching enabled, the right ID generation strategy (SEQUENCE, not IDENTITY), and code that periodically flushes and clears the persistence context to prevent memory exhaustion.

# Configuration
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true

@Transactional
public void batchInsert(List<Product> products) {
    for (int i = 0; i < products.size(); i++) {
        em.persist(products.get(i));
        if ((i + 1) % 50 == 0) {  // flush on batch boundaries, aligned with batch_size
            em.flush();
            em.clear();
        }
    }
}
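The ID strategy matters because Hibernate disables JDBC insert batching for IDENTITY - it must execute each INSERT individually to learn the generated key. A SEQUENCE generator with an allocationSize matching the batch size keeps batching effective; a sketch (the sequence name here is an assumption):

```java
@Entity
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "product_seq")
    @SequenceGenerator(name = "product_seq",
                       sequenceName = "product_seq",  // assumed DB sequence name
                       allocationSize = 50)           // one DB round-trip per 50 ids
    private Long id;

    // ...
}
```

With allocationSize = 50, Hibernate reserves a block of 50 ids per sequence call, so inserts can be assigned ids in memory and sent to the database in batches.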

For bulk updates where you don't need entity management, use JPQL UPDATE statements which bypass the persistence context entirely:

// JPA bulk update - much faster than loading entities
int updated = em.createQuery("""
    UPDATE Product p SET p.price = p.price * 1.1
    WHERE p.category = :category
    """)
    .setParameter("category", category)
    .executeUpdate();
 
// WARNING: Bypasses persistence context and cache
// Managed entities may have stale data
em.clear();  // Clear after bulk operations

What connection pool settings should you configure?

HikariCP is the default connection pool with Spring Boot and is highly performant with sensible defaults. However, you should tune the pool size based on your application's needs. Too few connections cause contention; too many waste resources and can overwhelm the database.

A common starting point is maximum-pool-size = (2 * CPU cores) + number of disks. Monitor your application's connection usage and adjust based on actual behavior.

spring.datasource.hikari.maximum-pool-size=10
spring.datasource.hikari.minimum-idle=5
spring.datasource.hikari.idle-timeout=300000
spring.datasource.hikari.connection-timeout=20000
spring.datasource.hikari.max-lifetime=1200000

Common Pitfalls Questions

These issues appear constantly in real applications and interviews. Understanding them demonstrates practical JPA experience.

What causes LazyInitializationException and how do you fix it?

LazyInitializationException occurs when you try to access a lazy-loaded relationship after the persistence context (session) has closed. This typically happens when a transaction ends and an entity becomes detached, but your code later tries to access a relationship that wasn't loaded.

@Transactional
public User getUser(Long id) {
    return userRepository.findById(id).orElseThrow();
}
 
// Calling code
User user = userService.getUser(1L);
// Transaction ended, session closed
 
user.getOrders().size();  // LazyInitializationException!
// Can't load orders - no session

There are several solutions, each appropriate for different situations:

// Solution 1: Fetch eagerly in the query
@Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.id = :id")
Optional<User> findByIdWithOrders(@Param("id") Long id);
 
// Solution 2: @EntityGraph
@EntityGraph(attributePaths = {"orders"})
Optional<User> findById(Long id);
 
// Solution 3: Keep transaction open longer (carefully!)
@Transactional(readOnly = true)
public UserDTO getUserWithOrders(Long id) {
    User user = userRepository.findById(id).orElseThrow();
    // Access orders within transaction
    return new UserDTO(user, user.getOrders());
}
 
// Solution 4: DTO projection (best for read-only)
@Query("SELECT new com.example.UserDTO(u.id, u.email) FROM User u WHERE u.id = :id")
Optional<UserDTO> findDtoById(@Param("id") Long id);

What cascade mistakes cause unexpected data loss?

Cascades can cause unexpected deletions if configured carelessly. Two common mistakes: using CascadeType.ALL on @ManyToOne relationships (deleting a child deletes the parent), and replacing a collection with orphanRemoval=true (old items get deleted).

// Problem: replacing a collection mapped with orphanRemoval
@OneToMany(mappedBy = "user", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Order> orders;
 
user.setOrders(newOrdersList);
// Old orders get deleted - or Hibernate aborts the flush with
// "A collection with orphan deletion was no longer referenced"
// Mutate in place instead: user.getOrders().clear() then addAll(...)
 
// Problem: CascadeType.REMOVE deletes more than expected
@ManyToOne(cascade = CascadeType.ALL)  // Don't cascade REMOVE on ManyToOne!
private User user;
 
orderRepository.delete(order);  // Deletes the User too!
 
// Best practice: Be explicit about cascades
@OneToMany(mappedBy = "user", cascade = {CascadeType.PERSIST, CascadeType.MERGE})
private List<Order> orders;
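Safer still is to mutate the collection only through helper methods that keep both sides of the bidirectional relationship in sync - a common convention, sketched here rather than taken from the original code:

```java
@Entity
public class User {

    @OneToMany(mappedBy = "user",
               cascade = {CascadeType.PERSIST, CascadeType.MERGE},
               orphanRemoval = true)
    private List<Order> orders = new ArrayList<>();

    public void addOrder(Order order) {
        orders.add(order);
        order.setUser(this);   // keep the owning side consistent
    }

    public void removeOrder(Order order) {
        orders.remove(order);
        order.setUser(null);   // orphanRemoval deletes the row on flush
    }
}
```

Callers never touch the list directly, so the owning side (the Order's user_id column) can't drift out of sync with the inverse side.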

How should you implement equals() and hashCode() for entities?

Entities used in Sets or as Map keys need proper equals() and hashCode() implementations. Using the database ID seems natural but causes problems: the ID is null before persist, breaking Set behavior. Using a business key (a natural unique identifier) is more reliable.

@Entity
public class Order {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
 
    // WRONG: Using database ID
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Order)) return false;
        Order order = (Order) o;
        return Objects.equals(id, order.id);
    }
    // Problem: id is null before persist, breaks Sets
 
    // BETTER: Use business key
    @Column(unique = true)
    private String orderNumber;
 
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Order)) return false;
        Order order = (Order) o;
        return Objects.equals(orderNumber, order.orderNumber);
    }
 
    @Override
    public int hashCode() {
        return Objects.hash(orderNumber);
    }
}
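When no natural business key exists, a widely used alternative is id-based equality with a constant hashCode: a transient entity (null id) is equal only to itself, and the constant keeps the hash stable when the id is assigned at persist time. A self-contained sketch (annotations omitted so it runs standalone; the class name is illustrative):

```java
import java.util.Objects;

public class OrderEntity {  // stands in for an @Entity

    private Long id;  // assigned by the database on persist

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderEntity)) return false;
        // a transient entity (null id) equals only itself
        return id != null && Objects.equals(id, ((OrderEntity) o).getId());
    }

    @Override
    public int hashCode() {
        // constant on purpose: the hash must not change when the id
        // flips from null to a generated value inside a HashSet
        return 31;
    }
}
```

The trade-off: every instance lands in the same HashSet bucket, degrading lookups to linear scans - acceptable for the small per-entity collections typical in domain models.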

How can merge() cause data loss in concurrent scenarios?

When you merge a detached entity, you're copying its state onto a managed entity. If another transaction modified the entity while yours was detached, merge overwrites those changes. This is a subtle race condition that many developers don't anticipate.

// Problem: Merge can lose changes
Order detached = getDetachedOrder();
detached.setStatus(OrderStatus.SHIPPED);
 
// Meanwhile, another transaction changed the order
Order managed = orderRepository.findById(detached.getId()).orElseThrow();
managed.setTotal(newTotal);
orderRepository.save(managed);
 
// Now merge the detached - overwrites the total change!
orderRepository.save(detached);  // merge() inside

The solution is to load fresh, modify specific fields, and let the managed entity track changes:

// Solution: Load, update specific fields
@Transactional
public void updateStatus(Long orderId, OrderStatus status) {
    Order order = orderRepository.findById(orderId).orElseThrow();
    order.setStatus(status);
    // Only status changes, other fields untouched
}

Quick Reference

| Topic | Key Points |
| --- | --- |
| JPA vs Hibernate | JPA is spec, Hibernate is implementation |
| Entity States | NEW, MANAGED, DETACHED, REMOVED |
| Relationships | mappedBy on inverse side, set owning side |
| Fetching | Collections LAZY by default, JOIN FETCH for N+1 |
| Locking | Optimistic (@Version) vs Pessimistic (FOR UPDATE) |
| Caching | L1 automatic, L2 for read-mostly data |
| Performance | Batch size, projections, query optimization |

Ready to ace your interview?

Get 550+ interview questions with detailed answers in our comprehensive PDF guides.

View PDF Guides