đź”™
PolyU Research Landscape & Collaboration Analysis (2020-2025)
Project Overview
- Objective:Â Analyze PolyU's research performance and global collaboration patterns (2020-2025) using Scopus and OpenAlex data.
- Deliverable:Â An interactive Power BI dashboard providing strategic insights into publication trends, citation impact, and key research areas etc.
Key Questions Addressed
- What are the trends in PolyU's publication volume and output types (articles, reviews, etc.) over the last 5 years?
- How has the citation impact (average citations, FWCI) of PolyU's research evolved? Which papers are most influential?
- What is the Open Access publishing rate, and how does it correlate with impact?
- What are the main research topics PolyU focuses on, and have these shifted?
- What are the patterns of research collaboration (internal, domestic HK, international)?
Data Sources
- Scopus API:Â Used initially to extract a comprehensive list of PolyU publications (DOIs, basic metadata) within the 2020-2025 timeframe and specific document types as baseline dataset.
- OpenAlex API: Used extensively via Python (pyalex library) to retrieve rich, detailed metadata for each publication identified via its DOI (Digital Object Identifier) extracted from SCOPUS
Implementation (ETL Process)
- Extract:Â Retrieved publication, citation count, author, institution, funder data from SCOPUS and OpenAlex Database for the past 5 years, 2020-2025, using SCOPUS and OpenAlex API through their Python library (pybliometrics and pyAlex).
- Transform: Processed complex API data using Python (pandas), normalizing it into a structured, relational format suitable for database upload, efficient analysis and large data volume management.
- Load:Â Loaded the normalized tables into a local Postgres Dtabase and connected Power BI directly for interactive reporting and visualization.
Database Schema (’Publications’ Table as Fact Table)
Remarks:
- Schema might take 10 seconds to load
- Hover over Tables to explore their relations. It’s interactive!
Data Visualization (2-page Power BI Report)
- Page 1: PolyU Research Profile
- Page 2: Research Output Analysis
Try interacting with the dashboard! Click directly on the data fields👆
Â
Future Enhancements
- Cloud Migration:Â Migrate database to the cloud (e.g. AWS Redshift Datawarehouse) for improved scalability, accessibility, and integration.
- Automated ETL:Â Implement an automated pipeline for periodic data updates (e.g., weekly or monthly) and auto-refreshed dashboards.
- Performance Tracking:Â Integrate Key Performance Indicators (KPIs) into the dashboard for strategic progress monitoring against goals and historical trends.