
How do I build a hybrid_method that counts the number of records in the past X days?

    Below is a (nearly) complete code snippet:

    # ... omitted import statements and session configuration
    
    def _date(date_str):
        return datetime.strptime(date_str, "%Y-%m-%d")
    
    
    class Match(Base):
        __tablename__ = "match"
    
        match_id = Column(Integer, primary_key=True)
        date = Column(Date, nullable=False)
    
        @hybrid_method
        def match_count(self, timespan_days):
            cut_off = self.date - timedelta(days=timespan_days)
            sess = object_session(self)
            M = Match
            q = (
                sess.query(M)
                # .filter(M.match_id != self.match_id)  # option-1: only other on the same day
                .filter(M.match_id < self.match_id)  # option-2: only smaller-id on the same day (as in OP)
                .filter(M.date <= self.date)
                .filter(M.date >= cut_off)
            )
            return q.count()
    
        @match_count.expression
        def match_count(cls, timespan_days):
            M = aliased(Match, name="other")
            # On PostgreSQL, "date - integer" subtracts that many days and yields a date.
            cut_off = cls.date - timespan_days
            q = (
                select([func.count(M.match_id)])
                # .where(M.match_id != cls.match_id)  # option-1: only other on the same day
                .where(M.match_id < cls.match_id)  # option-2: only smaller-id on the same day (as in OP)
                .where(M.date <= cls.date)
                .where(M.date >= cut_off)
            )
            return q.label("match_count")
    
    
    def test():
        Base.metadata.drop_all()
        Base.metadata.create_all()
    
        from sys import version_info as py_version
        from sqlalchemy import __version__ as sa_version
    
        print(f"PY version={py_version}")
        print(f"SA version={sa_version}")
        print(f"SA engine={engine.name}")
        print("=" * 80)
    
        # 1. test data
        matches = [
            Match(date=_date("2020-01-01")),
            Match(date=_date("2020-01-02")),
            Match(date=_date("2020-01-03")),
            Match(date=_date("2020-01-05")),
            Match(date=_date("2020-01-05")),
            Match(date=_date("2020-01-10")),
        ]
        session.add_all(matches)
        session.commit()
        print("=" * 80)
    
        # 2. test query in "in-memory"
        for m in session.query(Match):
            print(m, m.match_count(3))
        print("=" * 80)
    
        # 3. test query on "SQL"
        session.expunge_all()
        q = session.query(Match, Match.match_count(3))
        for match, match_count in q:
            print(match, match_count)
        print("=" * 80)
    
    
    if __name__ == "__main__":
        test()
    

    The code above produces the following output:

    ============================================================
    PY version=sys.version_info(major=3, minor=8, micro=1, releaselevel='final', serial=0)
    SA version=1.3.20
    SA engine=postgresql
    ============================================================
    <Match(date=datetime.date(2020, 1, 1), match_id=1)> 0
    <Match(date=datetime.date(2020, 1, 2), match_id=2)> 1
    <Match(date=datetime.date(2020, 1, 3), match_id=3)> 2
    <Match(date=datetime.date(2020, 1, 5), match_id=4)> 2
    <Match(date=datetime.date(2020, 1, 5), match_id=5)> 3
    <Match(date=datetime.date(2020, 1, 10), match_id=6)> 0
    ============================================================
    <Match(date=datetime.date(2020, 1, 1), match_id=1)> 0
    <Match(date=datetime.date(2020, 1, 2), match_id=2)> 1
    <Match(date=datetime.date(2020, 1, 3), match_id=3)> 2
    <Match(date=datetime.date(2020, 1, 5), match_id=4)> 2
    <Match(date=datetime.date(2020, 1, 5), match_id=5)> 3
    <Match(date=datetime.date(2020, 1, 10), match_id=6)> 0
    ============================================================
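
To make the counts above easier to check by hand, here is a minimal sketch that replays the same rule (option-2: smaller `match_id`, date within the last `timespan_days` days) in plain Python, with no database involved. The tuple layout and the standalone `match_count` function are my own illustration, not part of the model above.

```python
from datetime import date, timedelta

# Same test data as above, as (match_id, date); ids follow insertion order.
matches = [
    (1, date(2020, 1, 1)),
    (2, date(2020, 1, 2)),
    (3, date(2020, 1, 3)),
    (4, date(2020, 1, 5)),
    (5, date(2020, 1, 5)),
    (6, date(2020, 1, 10)),
]


def match_count(match_id, match_date, timespan_days):
    """Count matches with a smaller id dated within
    [match_date - timespan_days, match_date]."""
    cut_off = match_date - timedelta(days=timespan_days)
    return sum(
        1
        for other_id, other_date in matches
        if other_id < match_id and cut_off <= other_date <= match_date
    )


counts = [match_count(mid, d, 3) for mid, d in matches]
print(counts)
```

The resulting list lines up with the last column of both output blocks, one entry per `match_id`.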
    

    Meanwhile, the query q looks like this (on PostgreSQL):

    SELECT match.match_id,
           match.date,
    
      (SELECT count(other.match_id) AS count_1
       FROM match AS other
       WHERE other.match_id < match.match_id
         AND other.date <= match.date
         AND other.date >= match.date - %(date_1)s) AS match_count
    FROM match
    

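    One thing I would like to point out is that the "in-memory" check is not very efficient, because it has to query the database once for each Match instance. I would therefore use the last query whenever possible.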
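
As a rough illustration of the single-query approach, the sketch below runs the same correlated subquery through the standard-library `sqlite3` module, so it needs neither PostgreSQL nor SQLAlchemy. Two things here are assumptions of mine rather than part of the code above: dates are stored as ISO text, and the cut-off is computed with SQLite's `date(..., '-3 days')` instead of the PostgreSQL `date - integer` arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "match" is quoted because MATCH is also an SQL keyword in SQLite.
conn.execute(
    'CREATE TABLE "match" (match_id INTEGER PRIMARY KEY, date TEXT NOT NULL)'
)
dates = [
    "2020-01-01", "2020-01-02", "2020-01-03",
    "2020-01-05", "2020-01-05", "2020-01-10",
]
conn.executemany('INSERT INTO "match" (date) VALUES (?)', [(d,) for d in dates])

# One statement returns every row together with its count,
# instead of one extra COUNT query per Match instance.
sql = """
SELECT m.match_id,
       m.date,
       (SELECT count(other.match_id)
        FROM "match" AS other
        WHERE other.match_id < m.match_id
          AND other.date <= m.date
          AND other.date >= date(m.date, ?)) AS match_count
FROM "match" AS m
ORDER BY m.match_id
"""
rows = conn.execute(sql, ("-3 days",)).fetchall()
for row in rows:
    print(row)
```

All six counts come back from a single round trip, whereas the in-memory variant issues one `COUNT` query per row.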



