How GORM's Core Features Are Implemented

zkqiang's blog

Morning of April 9, 2022

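GORM is an ORM library commonly used for database access in Go. Compared with other ORM libraries of its kind, it supports a richer set of features and is updated fairly actively.

This article explores the source code behind its core features, based on GORM V2 (version v1.23.5).

The officially recommended way to connect to a database is gorm.Open, which takes a Dialector followed by zero or more options: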

func Open(dialector Dialector, opts ...Option) (db *DB, err error) {
	// ...
}

Dialector is an interface type. GORM calls its methods to build SQL statements that the target database can run:

type Dialector interface {
	Name() string
	Initialize(*DB) error
	Migrator(db *DB) Migrator
	DataTypeOf(*schema.Field) string
	DefaultValueOf(*schema.Field) clause.Expression
	BindVarTo(writer clause.Writer, stmt *Statement, v interface{})
	QuoteTo(clause.Writer, string)
	Explain(sql string, vars ...interface{}) string
}

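Every database has its own dialect. Pagination is one example: MySQL uses LIMIT 10, whereas Oracle uses FETCH NEXT 10 ROWS ONLY. Each database therefore needs its own implementation, which builds the right SQL behind the same interface methods. This is one of the main jobs of an ORM library: abstracting the code so that it stays compatible with different databases.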
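To make the idea concrete, here is a minimal sketch of dialect-based SQL generation. The pagingDialect interface, the two implementations, and buildSelect are hypothetical names invented for this illustration; GORM's real Dialector is much larger and handles limits through its clause system rather than a single method.

```go
package main

import "fmt"

// pagingDialect is a hypothetical, heavily reduced stand-in for gorm's
// Dialector: it only knows how to render a row limit.
type pagingDialect interface {
	Name() string
	LimitClause(n int) string
}

type mysqlDialect struct{}

func (mysqlDialect) Name() string             { return "mysql" }
func (mysqlDialect) LimitClause(n int) string { return fmt.Sprintf("LIMIT %d", n) }

type oracleDialect struct{}

func (oracleDialect) Name() string             { return "oracle" }
func (oracleDialect) LimitClause(n int) string { return fmt.Sprintf("FETCH NEXT %d ROWS ONLY", n) }

// buildSelect is database-agnostic: everything dialect-specific is
// delegated to the interface.
func buildSelect(d pagingDialect, table string, limit int) string {
	return fmt.Sprintf("SELECT * FROM %s %s", table, d.LimitClause(limit))
}

func main() {
	for _, d := range []pagingDialect{mysqlDialect{}, oracleDialect{}} {
		fmt.Printf("%s: %s\n", d.Name(), buildSelect(d, "users", 10))
	}
}
```

The calling code never branches on the database name; swapping the dialect value is enough to change the generated SQL.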

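GORM officially supports SQLite, MySQL, Postgres, and SQLServer. For other databases you can implement the interface yourself or use an open-source library from another developer.

The main part of a Dialector is its Initialize method: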

func (dialector Dialector) Initialize(db *gorm.DB) (err error) {
	callbacks.RegisterDefaultCallbacks(db, &callbacks.Config{})
	db.Callback().Create().Replace("gorm:create", Create)
	db.Callback().Update().Replace("gorm:update", Update)

	// ...
}

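This method is usually where callbacks get registered. GORM invokes them before, during, and after operations such as create, read, update, and delete, so you can also replace them with your own implementations.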
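The register-then-replace pattern can be sketched with a small named callback chain. Everything here (registry, namedCallback, newDefaultRegistry) is a hypothetical simplification for illustration and not GORM's actual processor type; the point is only that replacing by name keeps the callback's position in the chain.

```go
package main

import "fmt"

// namedCallback pairs a callback with the name it was registered under.
type namedCallback struct {
	name string
	fn   func() string
}

// registry is a hypothetical, simplified model of a callback chain.
type registry struct {
	callbacks []namedCallback
}

// Register appends a callback to the end of the chain.
func (r *registry) Register(name string, fn func() string) {
	r.callbacks = append(r.callbacks, namedCallback{name: name, fn: fn})
}

// Replace swaps the callback registered under name and keeps its position.
// It reports whether a callback with that name existed.
func (r *registry) Replace(name string, fn func() string) bool {
	for i := range r.callbacks {
		if r.callbacks[i].name == name {
			r.callbacks[i].fn = fn
			return true
		}
	}
	return false
}

// Run invokes every callback in order and collects what each one reports.
func (r *registry) Run() []string {
	out := make([]string, 0, len(r.callbacks))
	for _, cb := range r.callbacks {
		out = append(out, cb.fn())
	}
	return out
}

// newDefaultRegistry plays the role of registering the default callbacks.
func newDefaultRegistry() *registry {
	r := &registry{}
	r.Register("begin", func() string { return "begin" })
	r.Register("create", func() string { return "default create" })
	r.Register("commit", func() string { return "commit" })
	return r
}

func main() {
	r := newDefaultRegistry()
	// A driver can swap in its own implementation under the same name.
	r.Replace("create", func() string { return "dialect-specific create" })
	fmt.Println(r.Run())
}
```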

Generating SQL statements

The gorm.Open method mentioned above returns a pointer to a gorm.DB struct:

type DB struct {
	*Config
	Error        error
	RowsAffected int64
	Statement    *Statement
	clone        int
}

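Instances of this struct are used all the time: chain methods such as Table, Select, and Where are all methods on this struct pointer.

The SQL fragments and arguments produced by the chain methods are actually stored in the Statement field, whose struct is carried through the whole chain of calls: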

type Statement struct {
	*DB
	TableExpr            *clause.Expr
	Table                string
	Model                interface{}
	Unscoped             bool
	Dest                 interface{}
	ReflectValue         reflect.Value
	Clauses              map[string]clause.Clause
	BuildClauses         []string
	Distinct             bool
	Selects              []string // selected columns
	Omits                []string // omit columns
	Joins                []join
	Preloads             map[string][]interface{}
	Settings             sync.Map
	ConnPool             ConnPool
	Schema               *schema.Schema
	Context              context.Context
	RaiseErrorOnNotFound bool
	SkipHooks            bool
	SQL                  strings.Builder
	Vars                 []interface{}
	CurDestIndex         int
	attrs                []interface{}
	assigns              []interface{}
	scopes               []func(*DB) *DB
}

It contains attributes such as Table, Selects, and Joins. The Select method, for instance, stores the queried column names in Selects:

func (db *DB) Select(query interface{}, args ...interface{}) (tx *DB) {
	tx = db.getInstance()

	switch v := query.(type) {
	case []string:
		tx.Statement.Selects = v

		for _, arg := range args {
			switch arg := arg.(type) {
			case string:
				tx.Statement.Selects = append(tx.Statement.Selects, arg)
			case []string:
				tx.Statement.Selects = append(tx.Statement.Selects, arg...)
			default:
				tx.AddError(fmt.Errorf("unsupported select args %v %v", query, args))
				return
			}
		}

		if clause, ok := tx.Statement.Clauses["SELECT"]; ok {
			clause.Expression = nil
			tx.Statement.Clauses["SELECT"] = clause
		}
	// ...
	}

	return
}

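When a finisher method such as First, Find, Scan, Create, or Update is called, the contents stored in Statement are read back out during execution, assembled into the final SQL statement, and executed.

Take the Find method as an example: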
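The split between chain methods that only record state and a finisher that turns the state into SQL can be sketched as follows. The statement and query types and the Build method are hypothetical names for this illustration. Unlike GORM, which clones state through getInstance, this sketch mutates one value in place to stay short.

```go
package main

import (
	"fmt"
	"strings"
)

// statement is a hypothetical miniature of gorm's Statement: chain methods
// only record fragments here, nothing is sent to a database.
type statement struct {
	table   string
	selects []string
	wheres  []string
	vars    []interface{}
}

// query plays the role of *gorm.DB in this sketch.
type query struct {
	stmt statement
}

func (q *query) Table(name string) *query {
	q.stmt.table = name
	return q
}

func (q *query) Select(cols ...string) *query {
	q.stmt.selects = append(q.stmt.selects, cols...)
	return q
}

func (q *query) Where(cond string, args ...interface{}) *query {
	q.stmt.wheres = append(q.stmt.wheres, cond)
	q.stmt.vars = append(q.stmt.vars, args...)
	return q
}

// Build stands in for a finisher method: it reads the recorded fragments
// and assembles the SQL text plus its bind variables.
func (q *query) Build() (string, []interface{}) {
	cols := "*"
	if len(q.stmt.selects) > 0 {
		cols = strings.Join(q.stmt.selects, ", ")
	}
	var sb strings.Builder
	fmt.Fprintf(&sb, "SELECT %s FROM %s", cols, q.stmt.table)
	if len(q.stmt.wheres) > 0 {
		sb.WriteString(" WHERE " + strings.Join(q.stmt.wheres, " AND "))
	}
	return sb.String(), q.stmt.vars
}

func main() {
	sql, vars := new(query).Table("users").Select("id", "name").Where("age > ?", 18).Build()
	fmt.Println(sql, vars)
}
```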

func (db *DB) Find(dest interface{}, conds ...interface{}) (tx *DB) {
	tx = db.getInstance()
	if len(conds) > 0 {
		if exprs := tx.Statement.BuildCondition(conds[0], conds[1:]...); len(exprs) > 0 {
			tx.Statement.AddClause(clause.Where{Exprs: exprs})
		}
	}
	tx.Statement.Dest = dest
	return tx.callbacks.Query().Execute(tx)
}

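The final tx.callbacks.Query() returns the corresponding query processor, and the processor's Execute method does the last bits of housekeeping on the Statement, such as preprocessing and validating the arguments. p.fns is a list of callback functions, namely the ones registered during initialization through callbacks.RegisterDefaultCallbacks: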

func (p *processor) Execute(db *DB) *DB {
	// ...
	for _, f := range p.fns {
		f(db)
	}
	// ...
}

// The query callback registered by default
func Query(db *gorm.DB) {
	if db.Error == nil {
		BuildQuerySQL(db)

		if !db.DryRun && db.Error == nil {
			rows, err := db.Statement.ConnPool.QueryContext(db.Statement.Context, db.Statement.SQL.String(), db.Statement.Vars...)
			if err != nil {
				db.AddError(err)
				return
			}
			gorm.Scan(rows, db, 0)
			db.AddError(rows.Close())
		}
	}
}

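BuildQuerySQL builds the query SQL statement, which is then executed against the database through the connection pool, and gorm.Scan fills the result into the dest passed to Find(dest).

GORM supports the common table relationships: one-to-one, one-to-many, many-to-one, and many-to-many. They are stored in instances of the following structs: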
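The guards in the query callback (skip everything after an error, build SQL but do not run it in dry-run mode) can be modeled without a database. The execState type, the two callbacks, and newQueryProcessor are hypothetical names made up for this sketch; only the control flow is meant to mirror the excerpt.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidModel is an example error used to show short-circuiting.
var errInvalidModel = errors.New("invalid model")

// execState is a hypothetical stand-in for the *gorm.DB value that is
// passed through the callbacks.
type execState struct {
	DryRun   bool
	Err      error
	SQL      string
	Executed bool
}

type processor struct {
	fns []func(*execState)
}

// Execute runs every registered callback in order.
func (p *processor) Execute(s *execState) *execState {
	for _, f := range p.fns {
		f(s)
	}
	return s
}

// buildSQL fills in the statement text unless an earlier step failed.
// A real builder would read the accumulated clauses instead.
func buildSQL(s *execState) {
	if s.Err != nil {
		return
	}
	s.SQL = "SELECT * FROM users"
}

// runQuery mirrors the guard in the default query callback: it only
// "executes" when there is no error and dry-run mode is off.
func runQuery(s *execState) {
	if s.Err != nil || s.DryRun {
		return
	}
	s.Executed = true // a real callback would call the connection pool here
}

func newQueryProcessor() *processor {
	return &processor{fns: []func(*execState){buildSQL, runQuery}}
}

func main() {
	p := newQueryProcessor()
	fmt.Printf("%+v\n", *p.Execute(&execState{}))
	fmt.Printf("%+v\n", *p.Execute(&execState{DryRun: true}))
	fmt.Printf("%+v\n", *p.Execute(&execState{Err: errInvalidModel}))
}
```

Dry-run mode is handy for inspecting the generated SQL in tests, since the statement is still built even though nothing reaches the database.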

type Relationships struct {
	HasOne    []*Relationship
	BelongsTo []*Relationship
	HasMany   []*Relationship
	Many2Many []*Relationship
	Relations map[string]*Relationship
}

type Relationship struct {
	Name                     string
	Type                     RelationshipType
	Field                    *Field
	Polymorphic              *Polymorphic
	References               []*Reference
	Schema                   *Schema
	FieldSchema              *Schema
	JoinTable                *Schema
	foreignKeys, primaryKeys []string
}

During operations GORM scans the Model's nested structs, along with tags such as foreignkey and references, to work out how they relate to each other, and stores the result as a Relationship:

func (schema *Schema) parseRelation(field *Field) *Relationship {
	// ...

	if schema.err == nil {
		schema.Relationships.Relations[relation.Name] = relation
		switch relation.Type {
		case HasOne:
			schema.Relationships.HasOne = append(schema.Relationships.HasOne, relation)
		case HasMany:
			schema.Relationships.HasMany = append(schema.Relationships.HasMany, relation)
		case BelongsTo:
			schema.Relationships.BelongsTo = append(schema.Relationships.BelongsTo, relation)
		case Many2Many:
			schema.Relationships.Many2Many = append(schema.Relationships.Many2Many, relation)
		}
	}

	// ...
}

The create, read, update, and delete callbacks all include association handling. Association creation serves as the example here:

user := User{
  Name:            "jinzhu",
  Emails:          []Email{
    {Email: "[email protected]"},
    {Email: "[email protected]"},
  },
}

db.Create(&user)
// BEGIN TRANSACTION;
// INSERT INTO "users" (name,billing_address_id,shipping_address_id) VALUES ("jinzhu", 1, 2);
// INSERT INTO "emails" (user_id,email) VALUES (111, "[email protected]"), (111, "[email protected]") ON DUPLICATE KEY DO NOTHING;
// COMMIT;

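In this example, creating a User also creates the nested Emails, which shows that this is a one-to-many relationship. The INSERT INTO "emails" statement is executed by the callback registered as gorm:save_after_associations. The default callback walks every member list in Relationships and saves the associated Model objects: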

func SaveAfterAssociations(create bool) func(db *gorm.DB) {
	return func(db *gorm.DB) {
		// ...

		for _, rel := range db.Statement.Schema.Relationships.HasMany {
			// ...

			if elems.Len() > 0 {
				assignmentColumns := make([]string, 0, len(rel.References))
				for _, ref := range rel.References {
					assignmentColumns = append(assignmentColumns, ref.ForeignKey.DBName)
				}

				saveAssociations(db, rel, elems, selectColumns, restricted, assignmentColumns)
			}

			// ...
		}

		// ...
	}
}

func saveAssociations(db *gorm.DB, rel *schema.Relationship, rValues reflect.Value, selectColumns map[string]bool, restricted bool, defaultUpdatingColumns []string) error {
	// ...
	return db.AddError(tx.Create(values).Error)
}
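The ordering matters for has-many associations: the parent row has to exist first so that its primary key can be copied into each child's foreign key. A minimal in-memory sketch of that order follows. The store type, its methods, and the example.com addresses are all placeholders invented for this illustration, not GORM code.

```go
package main

import "fmt"

type email struct {
	ID     int
	UserID int // foreign key pointing at user.ID
	Addr   string
}

type user struct {
	ID     int
	Name   string
	Emails []email
}

// store is a hypothetical in-memory "database" that hands out
// auto-increment IDs and logs each insert.
type store struct {
	users  []user
	emails []email
	log    []string
}

func (s *store) insertUser(u *user) {
	u.ID = len(s.users) + 1
	s.users = append(s.users, user{ID: u.ID, Name: u.Name})
	s.log = append(s.log, fmt.Sprintf("INSERT users id=%d", u.ID))
}

func (s *store) insertEmail(e *email) {
	e.ID = len(s.emails) + 1
	s.emails = append(s.emails, *e)
	s.log = append(s.log, fmt.Sprintf("INSERT emails id=%d user_id=%d", e.ID, e.UserID))
}

// createUser saves the parent first, then acts like an "after
// associations" step: it assigns the new parent ID to every child's
// foreign key and saves the children.
func (s *store) createUser(u *user) {
	s.insertUser(u)
	for i := range u.Emails {
		u.Emails[i].UserID = u.ID
		s.insertEmail(&u.Emails[i])
	}
}

func main() {
	s := &store{}
	u := user{Name: "jinzhu", Emails: []email{{Addr: "a@example.com"}, {Addr: "b@example.com"}}}
	s.createUser(&u)
	for _, line := range s.log {
		fmt.Println(line)
	}
}
```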

On the transaction side, the commonly used Transaction method looks like this:

func (db *DB) Transaction(fc func(tx *DB) error, opts ...*sql.TxOptions) (err error) {
	panicked := true

	if committer, ok := db.Statement.ConnPool.(TxCommitter); ok && committer != nil {
		// nested transaction
		if !db.DisableNestedTransaction {
			err = db.SavePoint(fmt.Sprintf("sp%p", fc)).Error
			if err != nil {
				return
			}

			defer func() {
				// Make sure to rollback when panic, Block error or Commit error
				if panicked || err != nil {
					db.RollbackTo(fmt.Sprintf("sp%p", fc))
				}
			}()
		}
		err = fc(db.Session(&Session{NewDB: db.clone == 1}))
	} else {
		tx := db.Begin(opts...)
		if tx.Error != nil {
			return tx.Error
		}

		defer func() {
			// Make sure to rollback when panic, Block error or Commit error
			if panicked || err != nil {
				tx.Rollback()
			}
		}()

		if err = fc(tx); err == nil {
			panicked = false
			return tx.Commit().Error
		}
	}

	panicked = false
	return
}

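As you can see, it is essentially a wrapper around methods such as Begin and Commit. It first checks whether it is already running inside a transaction; if so, it sets a SavePoint, so that a failure rolls back only to that point rather than the whole outer transaction.

GORM has built-in tags for controlling field-level permissions, for example read-only, write-only, create-only, update-only, or ignored: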
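The panicked flag plus deferred rollback is a general Go pattern and can be tried out without a database. In the sketch below, fakeTx, its depth counter, the savepoint naming, and runDemo are hypothetical simplifications; the fake only records which statements would have been issued.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// fakeTx is a hypothetical recorder that logs the transaction statements
// a real connection would receive.
type fakeTx struct {
	log   []string
	depth int // 0 means no transaction is open
}

// transaction has the same shape as the method above: roll back on error
// or panic, commit otherwise, and use a savepoint when nested.
func (t *fakeTx) transaction(fn func(*fakeTx) error) (err error) {
	panicked := true
	if t.depth > 0 {
		// Nested call: protect the outer transaction with a savepoint.
		sp := fmt.Sprintf("sp%d", t.depth)
		t.log = append(t.log, "SAVEPOINT "+sp)
		t.depth++
		defer func() {
			t.depth--
			if panicked || err != nil {
				t.log = append(t.log, "ROLLBACK TO "+sp)
			}
		}()
		err = fn(t)
		panicked = false
		return err
	}

	t.log = append(t.log, "BEGIN")
	t.depth = 1
	defer func() {
		t.depth = 0
		if panicked || err != nil {
			t.log = append(t.log, "ROLLBACK")
		}
	}()
	if err = fn(t); err == nil {
		t.log = append(t.log, "COMMIT")
	}
	panicked = false
	return err
}

// runDemo nests a failing transaction inside a successful one.
func runDemo() []string {
	t := &fakeTx{}
	_ = t.transaction(func(tx *fakeTx) error {
		tx.log = append(tx.log, "INSERT a")
		// The inner error is deliberately ignored, so only the savepoint
		// is undone and the outer transaction still commits.
		_ = tx.transaction(func(inner *fakeTx) error {
			inner.log = append(inner.log, "INSERT b")
			return errBoom
		})
		return nil
	})
	return t.log
}

func main() {
	for _, line := range runDemo() {
		fmt.Println(line)
	}
}
```

Setting panicked to true up front and clearing it only after the callback returns is what lets the deferred function tell a panic apart from a normal return.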

type User struct {
  Name string `gorm:"<-:create"` // allow read and create
  Name string `gorm:"<-:update"` // allow read and update
  Name string `gorm:"<-"`        // allow read and write (create and update)
  Name string `gorm:"<-:false"`  // allow read, disable write permission
  Name string `gorm:"->"`        // readonly (disable write permission unless it configured)
  Name string `gorm:"->;<-:create"` // allow read and create
  Name string `gorm:"->:false;<-:create"` // createonly (disabled read from db)
  Name string `gorm:"-"`            // ignore this field when write and read with struct
  Name string `gorm:"-:all"`        // ignore this field when write, read and migrate with struct
  Name string `gorm:"-:migration"`  // ignore this field when migrate with struct
}

These tags are parsed together with their fields when the Model is parsed, and the information is stored in the Field structs held by the Schema struct:

type Schema struct {
	Name                      string
	ModelType                 reflect.Type
	Table                     string
	PrioritizedPrimaryField   *Field
	DBNames                   []string
	PrimaryFields             []*Field
	PrimaryFieldDBNames       []string
	Fields                    []*Field
	FieldsByName              map[string]*Field
	FieldsByDBName            map[string]*Field
	FieldsWithDefaultDBValue  []*Field
	Relationships             Relationships
	// ...
}

type Field struct {
	Name                   string
	DBName                 string
	PrimaryKey             bool
	AutoIncrement          bool
	AutoIncrementIncrement int64
	Creatable              bool
	Updatable              bool
	Readable               bool
	IgnoreMigration        bool
	HasDefaultValue        bool
	DefaultValue           string
	DefaultValueInterface  interface{}
	NotNull                bool
	Unique                 bool
	Tag                    reflect.StructTag
	TagSettings            map[string]string
	// ...
}

Field contains boolean attributes such as Creatable, Updatable, Readable, and IgnoreMigration, which are assigned from the tags during parsing:

func (schema *Schema) ParseField(fieldStruct reflect.StructField) *Field {
	// ...

	if val, ok := field.TagSettings["-"]; ok {
		val = strings.ToLower(strings.TrimSpace(val))
		switch val {
		case "-":
			field.Creatable = false
			field.Updatable = false
			field.Readable = false
			field.DataType = ""
		case "all":
			field.Creatable = false
			field.Updatable = false
			field.Readable = false
			field.DataType = ""
			field.IgnoreMigration = true
		case "migration":
			field.IgnoreMigration = true
		}
	}

	if v, ok := field.TagSettings["->"]; ok {
		field.Creatable = false
		field.Updatable = false
		if strings.ToLower(v) == "false" {
			field.Readable = false
		} else {
			field.Readable = true
		}
	}

	if v, ok := field.TagSettings["<-"]; ok {
		field.Creatable = true
		field.Updatable = true

		if v != "<-" {
			if !strings.Contains(v, "create") {
				field.Creatable = false
			}

			if !strings.Contains(v, "update") {
				field.Updatable = false
			}
		}
	}

	// ...
}
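The tag rules above can be exercised in isolation with a small standalone parser. The perms type, parseSettings, and permsFromTag are hypothetical names for this sketch. It assumes every flag starts out true and that a key without a value maps to itself, which is what the v != "<-" check in the excerpt suggests. It is a simplification, not GORM's own tag parser.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// perms holds the three read/write flags discussed above.
type perms struct {
	Creatable, Updatable, Readable bool
}

// parseSettings splits a gorm-style tag such as "->;<-:create" into a map.
// A key without a value maps to itself, so "<-" yields {"<-": "<-"}.
func parseSettings(tag string) map[string]string {
	settings := map[string]string{}
	for _, part := range strings.Split(tag, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		kv := strings.SplitN(part, ":", 2)
		if len(kv) == 2 {
			settings[kv[0]] = kv[1]
		} else {
			settings[kv[0]] = kv[0]
		}
	}
	return settings
}

// permsFromTag applies the settings in the same order as the excerpt:
// "-" first, then "->", then "<-".
func permsFromTag(tag string) perms {
	p := perms{Creatable: true, Updatable: true, Readable: true}
	s := parseSettings(tag)
	if v, ok := s["-"]; ok {
		switch strings.ToLower(strings.TrimSpace(v)) {
		case "-", "all":
			p = perms{}
		}
	}
	if v, ok := s["->"]; ok {
		p.Creatable, p.Updatable = false, false
		p.Readable = strings.ToLower(v) != "false"
	}
	if v, ok := s["<-"]; ok {
		p.Creatable, p.Updatable = true, true
		if v != "<-" {
			p.Creatable = strings.Contains(v, "create")
			p.Updatable = strings.Contains(v, "update")
		}
	}
	return p
}

// demo uses distinct field names so that the struct compiles.
type demo struct {
	A string `gorm:"<-:create"`
	B string `gorm:"->"`
	C string `gorm:"->:false;<-:create"`
	D string `gorm:"-"`
}

func main() {
	t := reflect.TypeOf(demo{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		fmt.Printf("%s %+v\n", f.Name, permsFromTag(f.Tag.Get("gorm")))
	}
}
```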

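Create, update, and other operations then check these attributes, which is how field permissions are enforced. For example, AutoMigrate ends up calling table creation, which skips any field whose IgnoreMigration is true: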

func (m Migrator) CreateTable(values ...interface{}) error {
	// ...

	for _, dbName := range stmt.Schema.DBNames {
		field := stmt.Schema.FieldsByDBName[dbName]
		if !field.IgnoreMigration {
			createTableSQL += "? ?"
			hasPrimaryKeyInDataType = hasPrimaryKeyInDataType || strings.Contains(strings.ToUpper(string(field.DataType)), "PRIMARY KEY")
			values = append(values, clause.Column{Name: dbName}, m.DB.Migrator().FullDataTypeOf(field))
			createTableSQL += ","
		}
	}

	// ...
}
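The skip logic is easy to reproduce on its own. In this sketch the column type, the createTableSQL function, and the data types used in main are hypothetical; the real migrator builds the statement with placeholders and asks the dialect for each column's full data type.

```go
package main

import (
	"fmt"
	"strings"
)

// column is a hypothetical, trimmed-down counterpart of schema.Field.
type column struct {
	DBName          string
	DataType        string
	IgnoreMigration bool
}

// createTableSQL renders a CREATE TABLE statement and leaves out every
// column that is marked IgnoreMigration.
func createTableSQL(table string, cols []column) string {
	defs := make([]string, 0, len(cols))
	for _, c := range cols {
		if c.IgnoreMigration {
			continue
		}
		defs = append(defs, fmt.Sprintf("`%s` %s", c.DBName, c.DataType))
	}
	return fmt.Sprintf("CREATE TABLE `%s` (%s)", table, strings.Join(defs, ","))
}

func main() {
	cols := []column{
		{DBName: "id", DataType: "bigint"},
		{DBName: "name", DataType: "varchar(255)"},
		{DBName: "cached_total", DataType: "bigint", IgnoreMigration: true},
	}
	fmt.Println(createTableSQL("users", cols))
}
```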
