The public release repository for SUSTech SQL (CS307) course project 2.

Overview

CS307 Spring 2021 Database Project 2

1. Source code

Download link:

For java: https://github.com/NewbieOrange/SUSTech-SQL-Project2-Public

For python: https://github.com/ziqin/SUSTech-CS307-Project2-Python

Interface Specification

We provide the interfaces code in two language (Java, Python). You can use either of them upon your preference. Notice that due to the inherent differences between these two language, the Score Policy may also be different.

The structure of the interfaces is as follows.

  • database folder stores connection information such as username, password, url, we only provides PostgreSQL as the DBMS.
  • dto folder stores a set of data objects that will be accessed by interfaces. Your implementation will use them as parameters or returned values.
  • service folder stores Service Interfaces, this is the folder you should pay special attention to. There exist multiple .java file where the interface signatures are stored. You need to implement you own class to fit these signatures.
  • exception folder stores exceptions that you should throw if something went wrong.
  • factory folder stores the ServiceFactory abstract class that you need to implement to create your service instances.

Your Tasks

  • Implement the service and factory interfaces to pass the base testcases.
  • Design your (PostgreSQL) database to satisfy the requirements of interfaces.
  • Profile your implementation and find ways to speed it up.
  • (Optional) Find other ways to implement similar functionalities as our interfaces and compare (some of) them, are they better, worse or have different use cases.

Here is a reference implementation, it shows you how to implement one method of an interface. To get a service working, you'll have to implement all its interfaces

The following code is just a guide, the code interacts with database will usually be written in the DAO layer

@ParametersAreNonnullByDefault
public class ReferenceStudentService implements StudentService {
    /* Some codes are omitted */
    @Override
    public void dropCourse(int studentId, int sectionId) {
        try (Connection connection = SQLDataSource.getInstance().getSQLConnection();
            PreparedStatement stmt = connection.prepareStatement("call drop_course(?, ?)")) {
            stmt.setInt(1, studentId);
            stmt.setInt(2, sectionId);
            stmt.execute();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
    /* Some codes are omitted */
}
public class ReferenceServiceFactory extends ServiceFactory {
    public ReferenceServiceFactory() {
        registerService(StudentService.class, new ReferenceStudentService());
        registerService(CourseService.class, new ReferenceCourseService());
        // registerService(<interface name>.class, new <your implementation>());
    }
}

After you have implemented your factory class, be sure to put your factory class name into the file ./config.properties. So that we can find your implementation and test.

serviceFactory=your.package.YourServiceFactory            // Your factory class name here.
jdbcUrl=jdbc:postgresql://localhost:5432/project2
username=postgres
password=postgres

Additional requirements of interface

Java

  • All add*() functions with int as return value should return the (presumably auto-generated) ID.
  • All arguments are guaranteed to be non-null, unless marked as @Nullable.
  • All return values (and their fields) should be non-null, unless explicitly documented otherwise. If a list/map is empty, put List.of()/Map.of() or equivalents instead of null.
  • Do NOT modify anything in the provided interfaces, or any of the framework code.
  • Your implementation should throw java.lang.UnsupportedOperationException if a method is not actually implemented, so the tests can fail quickly.

Python

  • All add*() functions with return value of int should return the (presumably auto-generated) ID.
  • All arguments are guaranteed to follow the type hints. Arguments passed won't be None unless explicitly annotated with Optional.
  • For return types with type hints, return values (and their fields) should not be None, unless explicitly documented. Use [], {}, set() or their equivalents instead of None for empty container in return values.
  • Your implementation should raise NotImplementedError if a method is not actually implemented, so that tests can fail quickly.

Rules

  • Data should be persisted on disk after each write operation instead of only modified in RAM. If you introduced a cache layer, you have to enforce the consistency. You should also ensure the durability in case of a sudden shutdown.
  • You should NOT use frameworks such as ORM.
  • You don't need to spend time on GUI/WEB, as we do NOT give extra scores for them.

Java-specific rules

  • You should NOT modify or add any class in package cn.edu.sustech.cs307. Use another package for your implementations.
  • You should NOT extend any class in package cn.edu.sustech.cs307.dto.
  • In this project, we use Maven to manage dependent libraries. If you want to introduce a new library, you need to record it in pom.xml. Your dependencies should be downloadable from the Maven Central repository.

Python-specific rules

TO BE ADDED

2. What to submit?

If you are using Java:

  • The entire Java repository (all source codes and configs, without the target folder).
  • The database dump from pg_dump without data in tables (by adding -s argument to pg_dump).
  • The jar file built using maven-assembly (Run IDEA-Maven-Plugins-assembly-single) in /target/xxxx-with-dependencies.jar.

If you are using Python:

3. What to deliver?

  • PASS BASE TEST: First and foremost, you should pass the base testcases, this is the basic requirement.

  • IMPROVE YOUR EFFICIENCY: After you passed the base tests, you need to find ways to improve the performance of your implementation. You can work on the following aspects.

    Resource Consumption

    • Memory Consumption: How much memory your database takes?
    • Disk Consumption: How much disk space your database takes? How are they distributed? (index, data, other info?)

    Speed

    • Data Import Speed: How much time your database need to import all data?
    • Data Modify Speed (Insertion, Update, Deletion): How much time your database need to change one/one hundred/one million rows?
    • Data Query Speed: How much time your database need to fetch one/one hundred/one million rows?
    • Cache Policy: How much time your database need to fetch a row if the row was just accessed by others?

    Concurrency

    • Simultaneous Query Number: How many queries can your database handles simultaneously?
    • Simultaneous Query Latency: How long does it take to query if there are many users connect to your database simultaneously.
    • Transaction Safety: Is your database safe with many users concurrently writing/reading to it?

    Correctness

    • Malformed Data Identification: Can your database identify the malformed data automatically?
    • ACID Principle
  • (Optional) DIFFERENT WAYS SAME GOAL? Can you find other ways to implement these functionalities? Are they BETTER/WORSE/USECASE-RELATED? Please do share us your amazing ideas.

Project Timeline

Code Submission Deadline: June 6th

Presentation Time: June 15th by Tencent Meeting

Comments
  • Questions about ignoreConflict in StudentService

    Questions about ignoreConflict in StudentService

    Can you explain what to do when ignoreConflict is true and there is a time or course conflict? More specifically, if ignoreConflict is true, the returned list of CourseSearchEntry should contain no conflicting sections or conflictCourseNames in CourseSearchEntry of each section contains nothing.

    Answered 
    opened by Ddduanxt 16
  • ReferenceStudent 中的enrollCourse的问题描述

    ReferenceStudent 中的enrollCourse的问题描述

    问题一: enrollcourse 中对于already_enrolled的描述是:The course (of the section) is already passed by the student 所以看的是course_id有无在选(也就是说先看正在选的这个section_id所在的course_id,再正在选的course_id对应的全部section_id,看有无已经选择),而不是section_id直接赋值的判断。但数据判断中看到的是section_id赋值的直接判断 问题二: 如果是section_id直接赋值的判断,那么enrollcourse 时直接enroll 同一个section_id 两次。第二次是判定ALREADY_ENROLLED呢?还是COURSE_CONFLICT_FOUND呢?答案是COURSE_CONFLICT_FOUND,但是我觉得ALREADY_ENROLLED更加贴切

    Invalid 
    opened by KingOfXi 12
  • benchmark.py 改用asyncio.gather()后运行hang停滞

    benchmark.py 改用asyncio.gather()后运行hang停滞

    前面import都没有问题,运行到testing search course1就一直停在那边。之前的benchmark.py没有这个问题 Import departments Import majors Import users Import semesters Import courses Import major courses Import time usage: 10.15s Testing search course 1

    Answered 
    opened by zion302 10
  • package wrong(java)

    package wrong(java)

    All the packages can't work normally. For example: package cn.edu.sustech.cs307.dto.prerequisite;

    If change the package to package main.java.cn.edu.sustech.cs307.dto.prerequisite; It will be ok.

    I wonder how to solve this problem. I think changing all the packages is not so good.....

    Invalid Question 
    opened by xbdeng 10
  • 关于错误的数据

    关于错误的数据

    在debug时发现了很多result里的错误答案。 例如,在searchcourse中,有给定的如下param: [11717372, 3, null, null, null, null, 9, null, ALL, false, false, false, true, 10, 0] 可以看到,这里的searchClassTime 为9 我们的程序给出了正确的对应classtime的答案: CourseSearchEntry{course=Course{id='RD472', name='计算机程序设计基础', credit=2, classHour=32, grading=HUNDRED_MARK_SCORE}, section=CourseSection{id=1406, name='中文班', totalCapacity=66, leftCapacity=66}, sectionClasses=[CourseSectionClass{id=2195, instructor=User{id=30000136, fullName='郑锋'}, dayOfWeek=THURSDAY, weekList=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], classBegin=9, classEnd=10, location='荔园6栋402机房'}], conflictCourseNames=[]}

    可是result给出了如下显然不满足classtime限制的答案: CourseSearchEntry{course=Course{id='RD472', name='计算机程序设计基础', credit=2, classHour=32, grading=HUNDRED_MARK_SCORE}, section=CourseSection{id=1406, name='中文班', totalCapacity=66, leftCapacity=66}, sectionClasses=[CourseSectionClass{id=2211, instructor=User{id=30000136, fullName='郑锋'}, dayOfWeek=THURSDAY, weekList=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], classBegin=5, classEnd=6, location='荔园1栋101'}, CourseSectionClass{id=2195, instructor=User{id=30000136, fullName='郑锋'}, dayOfWeek=THURSDAY, weekList=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], classBegin=9, classEnd=10, location='荔园6栋402机房'}], conflictCourseNames=[]}

    可见,result给出的答案包含一个5-6的class,显然错误。 这样的错误在数据中并不少,请问应该如何处理?

    Answered 
    opened by TANSixu 8
  • 关于enrollCourse中COURSE_CONFLICT_FOUND的疑问

    关于enrollCourse中COURSE_CONFLICT_FOUND的疑问

    我在测试enrollcourse的时候发现自己有一个样例过不去,报错信息为Enroll result error: EnrollResult.COURSE_CONFLICT_FOUND, expected: EnrollResult.SUCCESS

    在我进行具体查看的时候发现11713333这位同学先选了一个学期的CS205这门课,又尝试选另外一个学期的CS205,在这种情况下不是应该按照文档里所说的: /** * The student's enrolled courses has time conflicts with the section, * or has course conflicts (same course) with the section. */ COURSE_CONFLICT_FOUND,

    的第二种情况报错吗,但是答案确实SUCCESS 使用的是python

    Answered 
    opened by Antoniano1963 7
  • 数据库编码影响结果排序顺序

    数据库编码影响结果排序顺序

    在testSearchCourses1的第133个case中,expected第0项courseid为CS102A,fullName为计算机程序设计基础A[中英双语1班-实验5班],第5项courid为CS102A,fullName为计算机程序设计基础A[英文班-实验3班],但是按照文档要求,应该以courseid为第一关键字,fullName为第二关键字排序,这两项courseid相同,但第五项的fullName字典序小于第0项,按文档要求应是第五项排在前面 ,此处数据是否有误?

    Answered 
    opened by dengsh12 7
  • searchCourseMixedResult.json 数据问题

    searchCourseMixedResult.json 数据问题

    目前searchCourse1里1000个case就剩下40几个没有match的,都在 searchCourseMixed里面。举例其中已发现问题的一个: searchCourseMixed里下标为24的为: image location 含‘二教’ , dayofweek=Wednesday,courseType=Public。

    对应searchCourseMixedResult位置的search结果为: image 有两个sectionclass, 其中第二个没问题,都符合搜索的条件,但是第一个,dayofweek,地点都不符合。不明白为什么会是要求的搜索结果?还是我有啥地方没get到?

    Invalid 
    opened by zion302 6
  • 新数据中的CourseSection没有semesterId=2的

    新数据中的CourseSection没有semesterId=2的

    刚刚我下载了新数据并进行了测试,发现导入的CourseSection中没有semester_id=2的。 我在addCourseSection方法中加入输出语句,在输入的semester_id=2时输出course_id和section_name,但什么也没有输出来。 这也导致了后面searchCourse出现和expected不一致的情况。 20210606131059 是否是数据出了问题呢?

    Answered 
    opened by SpaceIshtar 6
  • 关于课程剩余容量的问题

    关于课程剩余容量的问题

    再添加course section的时候是把left capacity设置成和total capacity一样,然后再一个学生通过enrollCourse返回SUCCESS的时候同时将这个学生选上这门课以及left capacity-1 吗。那addEnrolledCourseWithGrade方法也需要将课程容量减去新添加的人数吗,以及这个方法是否需要判断section还有没有剩余容量

    Answered 
    opened by Ddduanxt 6
  • List<CourseSearchEntry> searchCourse

    List searchCourse

    There are some problems with this function.

    1. What is the searchName, should I use a Fuzzy query?

    2. searchCourseType If a student doesn't have a major, what is the searchCourseType.

    Invalid 
    opened by xbdeng 5
Owner
null
🚀flink-sql-submit is a custom SQL submission client

??flink-sql-submit is a custom SQL submission client This is a customizable extension of the client, unlike flink's official default client.

ccinn 3 Mar 28, 2022
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Trino is a fast distributed SQL query engine for big data analytics. See the User Manual for deployment instructions and end user documentation. Devel

Trino 6.9k Dec 31, 2022
requery - modern SQL based query & persistence for Java / Kotlin / Android

A light but powerful object mapping and SQL generator for Java/Kotlin/Android with RxJava and Java 8 support. Easily map to or create databases, perfo

requery 3.1k Jan 5, 2023
The official home of the Presto distributed SQL query engine for big data

Presto Presto is a distributed SQL query engine for big data. See the User Manual for deployment instructions and end user documentation. Requirements

Presto 14.3k Dec 30, 2022
jOOQ is the best way to write SQL in Java

jOOQ's reason for being - compared to JPA Java and SQL have come a long way. SQL is an "ancient", yet established and well-understood technology. Java

jOOQ Object Oriented Querying 5.3k Jan 4, 2023
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.

About CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time. CrateDB offers the

Crate.io 3.6k Jan 2, 2023
jdbi is designed to provide convenient tabular data access in Java; including templated SQL, parameterized and strongly typed queries, and Streams integration

The Jdbi library provides convenient, idiomatic access to relational databases in Java. Jdbi is built on top of JDBC. If your database has a JDBC driv

null 1.7k Dec 27, 2022
Java code generator for calling PL/SQL.

OBridge OBridge provides a simple Java source code generator for calling Oracle PL/SQL package procedures. Supported input, output parameters and retu

Ferenc Karsany 21 Oct 7, 2022
Persistent priority queue over sql

queue-over-sql This projects implement a persistent priority queue (or a worker queue) (like SQS, RabbitMQ and others) over sql. Why? There are some c

Shimon Magal 13 Aug 15, 2022
SQL tasarım komutları ve Backend yazıldı. Projeye yıldız Vermeyi Unutmayın 🚀 Teşekkürler! ❤️

HumanResourcesManagementSystem-HRMS SQL tasarım komutları ve Backend yazıldı. Projeye yıldız Vermeyi Unutmayın ?? Teşekkürler! ❤️ insan kaynakları yön

samet akca 7 Nov 6, 2022
SQL made uagliò.

GomorraSQL is an easy and straightforward interpreted SQL dialect that allows you to write simpler and more understandable queries in Neapolitan Langu

Donato Rimenti 1.1k Dec 22, 2022
Multi-DBMS SQL Benchmarking Framework via JDBC

BenchBase BenchBase (formerly OLTPBench) is a Multi-DBMS SQL Benchmarking Framework via JDBC. Table of Contents Quickstart Description Usage Guide Con

CMU Database Group 213 Dec 29, 2022
HasorDB is a Full-featured database access tool, Providing object mapping,Richer type handling than Mybatis, Dynamic SQL

HasorDB is a Full-featured database access tool, Providing object mapping,Richer type handling than Mybatis, Dynamic SQL, stored procedures, more dialect 20+, nested transactions, multiple data sources, conditional constructors, INSERT strategies, multiple statements/multiple results. And compatible with Spring and MyBatis usage.

赵永春 17 Oct 27, 2022
An open source SQL database designed to process time series data, faster

English | 简体中文 | العربية QuestDB QuestDB is a high-performance, open-source SQL database for applications in financial services, IoT, machine learning

QuestDB 9.9k Jan 1, 2023
Free universal database tool and SQL client

DBeaver Free multi-platform database tool for developers, SQL programmers, database administrators and analysts. Supports any database which has JDBC

DBeaver 29.8k Jan 1, 2023
Simple but powerful SQL library for tiny projects

KiORM - Simple and powerful MySQL Java library for tiny projects Notice: KiORM is still in SNAPSHOT state. The code is not tested, there is no Javadoc

Rikonardo 2 Sep 13, 2022
A Java library to query pictures with SQL-like language

PicSQL A Java library to query pictures with SQL-like language. Features : Select and manipulate pixels of pictures in your disk with SQL-like dialect

Olivier Cavadenti 16 Dec 25, 2022