March 18, 2025

Mongo in "JPA": spring-boot-starter-data-mongodb

jpa

mongo

springboot

Dependencies

implementation("org.springframework.boot:spring-boot-starter-data-mongodb")

Define a Document Class (Namely, a Collection)

import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document
import org.springframework.data.mongodb.core.mapping.Field
import java.io.Serializable

enum class LLMDefaultLanguage(val code: String) {
    EN("en"),
    TC("tc");
}

// Maps this class to the "llmsummaries" collection.
@Document(collection = "llmsummaries")
data class LLMSummary(
    @Id
    val id: String? = null,
    val isDeleted: Boolean?,
    val defaultLanguage: String?,
    val messagesSessionId: String?,
    val messagesSessionType: String?,
    val timeZone: String?,
    val success: Boolean?,
    val time_cost: Float?,
    val version: String?,
    val model_meta: List<ModelMeta>?,
    val result: List<LLMResult>?,
    val zhResult: List<LLMResult>?,
    val createdAt: String?,
    val updatedAt: String?
) {

Here the @Id-annotated field is stored as _id: ObjectId when written to the database and read back as id: String automatically.

    data class LLMResult(
        var issueId: String?,
        var summaryUuid: String?,
        val groupID: Int?,
        val lang: String?,
        val endTime: String?,
        val startTime: String?,
        val startTimestamp: Float?,
        val endTimestamp: Float?,
        val originalScripts: List<OriginalScript>? = listOf(),
        val imgUrls: List<String>?,
        val keyPoints: List<String>?,
        val summary: String?,
        val title: String?,
        val priority: OnOffProperty?,
        val sentiment: OnOffProperty?,
        val is_identity: Boolean?,
        val impact: List<OnOffProperty>?,
        val speechAct: List<OnOffProperty>?,
        val topic: List<OnOffProperty>?
    ) {
        data class OriginalScript(
            val time: Float,
            val name: String,
            val text: String
        )
    }

    data class OnOffProperty(
        var name: String? = null,
        var on: Boolean? = null,
        @Field("id")
        var id: String? = null,
    ) : Serializable

id is a reserved name in spring-data-mongo. Unless otherwise specified, a field named id is saved as _id in the Mongo document. To keep the field name id, we need to annotate it with @Field("id").

    data class ModelMeta(
        val model_name: String,
        val token_count: TokenCount
    ) {
        data class TokenCount(
            val input_tokens: Int,
            val output_tokens: Int,
            val total_tokens: Int
        )
    }
}
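
Spring Data performs the String-to-ObjectId conversion for the @Id field only when the value can be parsed as an ObjectId, that is, a 24-character hexadecimal string; other values are stored as plain strings. Below is a minimal sketch of that rule. The helper looksLikeObjectId is hypothetical, written only for illustration, and is not a Spring Data API:

```kotlin
// Hypothetical helper (not part of Spring Data): mirrors the "24 hex characters"
// rule that decides whether a String id can be stored as an ObjectId.
fun looksLikeObjectId(id: String): Boolean =
    id.length == 24 && id.all { it in '0'..'9' || it in 'a'..'f' || it in 'A'..'F' }

fun main() {
    println(looksLikeObjectId("65f8a1b2c3d4e5f6a7b8c9d0")) // true
    println(looksLikeObjectId("my-custom-id"))             // false
}
```

If ids are assigned by hand rather than left as null for Mongo to generate, a check like this helps predict whether _id ends up as an ObjectId or as a string.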

Define a Repository

Auto Generated Queries and Custom Queries

With the document class defined explicitly, the IDE can auto-complete derived query method names (built from the class's property names) inside the repository interface:

import com.wonderbricks.billie.commons.mongo.LLMSummary
import org.bson.types.ObjectId
import org.springframework.data.mongodb.repository.MongoRepository
import org.springframework.data.mongodb.repository.Query
import org.springframework.data.mongodb.repository.Update
import java.util.stream.Stream

interface LLMSummaryMongoRepository : MongoRepository<LLMSummary, ObjectId> {
    // Derived query: matches documents whose result[].summaryUuid is in the given list.
    fun findAllByResultSummaryUuidIn(resultSummaryUuids: List<String>): MutableList<LLMSummary>

    // Custom query: the uuid may appear in either result or zhResult; newest first.
    @Query(sort = "{ '_id': -1 }",
           value = "{ \$or: [{ 'result.summaryUuid' : ?0 }, { 'zhResult.summaryUuid' : ?0 } ] }")
    fun findByResultSummaryUuid(uuid: String?): Stream<LLMSummary>?

    fun findByMessagesSessionIdIn(sessionIds: List<String>): List<LLMSummary>

    fun findByMessagesSessionId(sessionId: String): LLMSummary?

Remarks on @Query highlighted above:

  • Unlike JPA, positional arguments start from ?0 (in JPA they start from ?1).

  • Unlike JPA, named parameters via @Param are not supported by default (special config in application.yml is needed).
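
One more Kotlin-specific detail is visible in the query strings: $ starts a string template in Kotlin, so Mongo operators such as $or must be escaped. In a regular string a backslash works (\$or); in a raw triple-quoted string backslash escapes are not processed, so the literal character is interpolated with ${'$'}. A small sketch showing that both spellings produce the same text (the variable names are illustrative only):

```kotlin
// Regular string: a backslash escapes the dollar sign.
val escapedQuery = "{ \$or: [ { 'result.summaryUuid': ?0 } ] }"

// Raw string: no backslash escapes, so the '$' character is interpolated instead.
val rawQuery = """{ ${'$'}or: [ { 'result.summaryUuid': ?0 } ] }"""

fun main() {
    println(escapedQuery)             // { $or: [ { 'result.summaryUuid': ?0 } ] }
    println(escapedQuery == rawQuery) // true
}
```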

Complex Update Aggregation Pipeline

For complex updates we can write a custom filter using @Query and a custom update (an update document or an aggregation pipeline) using @Update.

    @Query("""
    {
        'result': { ${'$'}exists: true },
        'result.topic': { ${'$'}exists: true }
    }
    """)
    @Update("""
        {
            '${'$'}pull': {
                'result.${'$'}[].topic': {
                    'id': { ${'$'}in: ?0 }
                }
            }
        }
    """)
    fun removeEnResultsTopicByTagIds(tagIds: List<String>)
}
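
To make the effect of the $pull with the all-positional operator $[] concrete, here is an in-memory analogue in plain Kotlin. It is only a sketch of the semantics, not how the driver executes the update; Topic, Result and pullTopics are simplified, hypothetical stand-ins for the nested document types above:

```kotlin
// Simplified stand-ins for the nested document types (only the fields needed here).
data class Topic(val id: String?, val name: String?)
data class Result(val summaryUuid: String?, val topic: List<Topic>?)

// In-memory analogue of { $pull: { 'result.$[].topic': { id: { $in: tagIds } } } }:
// for every element of `results`, drop the topics whose id is in tagIds.
fun pullTopics(results: List<Result>, tagIds: List<String>): List<Result> {
    val ids: Set<String?> = tagIds.toSet()
    return results.map { r -> r.copy(topic = r.topic?.filterNot { it.id in ids }) }
}

fun main() {
    val before = listOf(
        Result("s1", listOf(Topic("t1", "billing"), Topic("t2", "login"))),
        Result("s2", listOf(Topic("t2", "login")))
    )
    val after = pullTopics(before, listOf("t2"))
    println(after.map { it.topic?.map(Topic::id) }) // [[t1], []]
}
```

Every element of result is visited, and an element whose topics are all removed keeps an empty topic list rather than being deleted itself.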

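Note that the @Update part can be a single update document or an array of pipeline stages (Spring Data exposes the latter through the pipeline attribute of @Update), as in the native updateMany API in the JavaScript world: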

db.collection.updateMany({ filter }, [
  { $set: { field: expression } },
  { $unset: ["fieldToRemove"] },
]);

Indexing

To add an index on a nested field, we add the following right below @Document:

@CompoundIndex(
    def = "{'result.topic.id': 1}",
    name = "topic_id_idx"
)

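Here 1 means that the index is built in ascending order (adding an index is like adding a sortable key). Note that Spring Data MongoDB 3.0 and later do not create annotation-declared indexes automatically unless spring.data.mongodb.auto-index-creation is set to true.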